ABSTRACT

In order to identify and explain the reasons for
differences in average on-line search rates among terminal
installations, the operations and performance of several facilities
using the Lockheed DIALOG system for on-line searching of the ERIC
data base were studied. Detailed examinations were made of such
aspects as the DIALOG system response time as a function of the time
of day or day of the week; the search commands and logic used by each
of the terminal installations for their operations; the mix of
complex, medium, or simple questions processed at each terminal
location; and the extent and impact of the variant forms of
descriptors in the file (e.g., singular and plural forms of the same
term). Timing studies were performed to suggest some terminal
procedures that could increase average on-line search speeds.
Guidelines for searchers to consider for pre-search and terminal
activities are presented at the end of the study. (Author/PF)
ABSTRACT

This research study examined the operations and performance of
several facilities that were using the Lockheed DIALOG system for on-line
searching of the ERIC data base. The study was done with the objective
of identifying the factors that significantly influenced the productivity
of terminal use. Detailed examinations were made of such aspects as the
DIALOG system response time as a function of the time of day or day of
the week; the search commands and logic used by each of the terminal
installations for their operations; the mix of complex, medium, or simple
questions processed at each terminal location; and the extent and impact
of the variant forms of descriptors in the file (e.g., singular and plural
forms of the same term). Guidelines were prepared for the searchers to
consider for pre-search and terminal activities. Timing studies were
performed to suggest some terminal procedures that could increase average
on-line search speeds.
I. INTRODUCTION

A. THE PROBLEM

Since 1969 the Information Sciences Laboratory of the Lockheed
Palo Alto Research Laboratory has offered on-line computer searching
of the Educational Resources Information Center (ERIC) data base.
During the past three years there has been a growing number of
organizations conducting ERIC searches on-line using terminal equipment
linked via telephone connections to the Lockheed computer facility in
Palo Alto, California. Several installations have been heavy users of
the service over a long period of time.

In studying the searching activity of the various ERIC service
subscribers over a period of many months, it has been observed that wide
variations exist among installations with respect to the average number
of ERIC searches processed per unit of terminal time. Certain
organizations consistently conduct more on-line searches per hour on their
terminals than do others. Some installations consistently realize close to
three times as many searches per hour as some other installations. This
seems rather surprising considering that each installation searches the
same data base, uses similar terminal equipment, receives similar
instruction, uses similar support tools, and is served by the same central
facility.

The extent of the variation in average search time among different
organizations is shown in Figure 1, where the average search rate is given
for 11 different terminals that actively used the ERIC/DIALOG system at
some time during the period August 1972 through September 1974. (Not all
of these organizations were still using ERIC/DIALOG during the Fall of 1973,
the period examined most intensively during this investigation.) Some of
these data were initially published in issues of the ERIC/DIALOG Chronolog,
and are summarized in Table 1. It may be seen that while the average search
rate fluctuates substantially from month to month, there are several
installations that consistently process more questions per terminal hour
than do several of the others. These installations, over a period of many
months, have a typical average search rate of 9 or 10 questions per hour or
more, compared with 3 or 4 questions per hour for some of the other
terminals. Note also that there has been a dramatic improvement in the
search speeds for many of the installations since the early days of
ERIC/DIALOG operation.
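
The search rate used in this comparison is a simple ratio: questions processed divided by terminal hours. A minimal Python sketch of that arithmetic follows. The installation labels and monthly figures are hypothetical, chosen only to echo the contrast between 9 or 10 and 3 or 4 questions per hour described above; they are not data from the study.

```python
def search_rate(searches, terminal_minutes):
    """Average number of searches processed per terminal hour."""
    if terminal_minutes <= 0:
        raise ValueError("terminal time must be positive")
    return searches / (terminal_minutes / 60.0)


# Hypothetical monthly logs: (searches processed, connected terminal minutes).
monthly_logs = {
    "installation_a": (114, 720),
    "installation_b": (42, 720),
}

rates = {name: search_rate(*log) for name, log in monthly_logs.items()}
ratio = rates["installation_a"] / rates["installation_b"]

for name, rate in rates.items():
    print(f"{name}: {rate:.1f} searches per terminal hour")
print(f"ratio between the two installations: {ratio:.1f}")
```

Expressing both installations in the same unit (searches per terminal hour) is what makes a ratio such as "close to three times" meaningful, since the raw search counts alone depend on how long each terminal was connected.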
B. OBJECTIVES

The primary objective of this study was to identify and explain
the reasons for the differences in average on-line search rates between
different terminal installations. An additional objective was the
development of a set of guidelines for conducting ERIC searches on-line
via the Lockheed facilities that would enable the user to carry out the
searching process in a more efficient and effective manner.

We were interested in identifying the factors that significantly
influenced terminal productivity, with a view to establishing guidelines
or suggested procedures that would permit the terminal operators to make
the best use of their available facilities, and would provide some
suggestions for additional system improvements.

Although the primary objectives were focused on searching the ERIC
data base with the DIALOG system, it is expected that some of the findings
will be applicable to other ERIC search systems and to other on-line search
systems. With over 100 terminal installations presently searching the ERIC
data base, and with more being added every day, this seemed to be a topic of
increasing interest to a large number of organizations.
II. SUMMARY

The first objective of this study was to investigate various factors
that could possibly influence the productivity of ERIC/DIALOG on-line
searching and that might explain the wide difference in search productivity
that was experienced by various terminal installations. Considering only the
ERIC data base and the DIALOG search system, we have identified many
different factors that could influence the rate at which searches were done
at such terminals. Such factors include:

• computer and communication system loading (possibly reflected in
different system speeds and response times for various hours of
the day or days of the week)
• keyboard or typing skills of the searchers
• complexity of the questions being searched
• characteristics and speed of the terminal equipment used
• work habits and search formulation style of individual searchers
• extent of pre-search planning and work that is done before using
the terminal
• availability and use of printed analyst reference tools
• extent of use of SEARCH SAVEs or other prior search efforts
• extent to which the terminal equipment is being used as an output
device
• extent of use of operating shortcuts with the DIALOG system
• extent of relevant continuing education and association with
other searchers
• subject expertise of the searcher and the installation
• cost-conscious attitude of the searcher and the installation
• degree of user versus intermediary searching, and extent of user
involvement in the on-line interaction



Data were collected during this study that focused on several of
these factors; however, we do not really know the impact or influence of
all of these factors.

We have come to the conclusion that there is no single dominant
factor that strongly influences the search speeds. We see instead a
complex pattern of many influences at work. At this time we cannot even
rank all of these factors in terms of their relative influence on search
speeds and feel confident about that ranking. However, we can sift out
and identify some of the most important, and some of the least important,
factors.

The factors that seemed to have little influence on search speeds
were:

• computer system loading (and time of day or day of the week).
Timing tests showed very little difference in search speed
for various times of the day, or days of the week.
• keyboard or typing skills of the searcher

Factors that seemed to have the most influence on search speeds
were:

• work habits and search formulation styles (e.g., question
complexity, recall/precision goals) of the searchers
• extent of pre-search preparation for each question
• cost-conscious attitude of searchers
• searcher familiarity with the data base and operating skill with
the on-line system (i.e., how good a "driver," and how many shortcuts
does the searcher know and use?)
• characteristics and speed of the terminal equipment

Unfortunately, only the last of these factors is supported by any
solid evidence from this study. The other factors are on the list primarily
on the basis of observation, personal experience, and discussions with other
searchers; however, we are confident in their selection. The remaining
factors on the initial list have some influence, but to an extent yet to
be determined.

The second objective of this study was to develop some guidelines
to help improve the performance of ERIC/DIALOG searching. As a result
of some timing exercises and other controlled experiments, and discussions
with searchers at other installations, specific performance-improving
practices were suggested with regard to such aspects as:

• deciding when to do a search manually instead of on-line
• handling variant forms of subject terms
• use of the various analyst support tools
• use of the SEARCH SAVE feature
• ways to limit search output
• initialization procedures
• ways to SELECT descriptors
• deciding how many terms to use to adequately describe a search
topic
A literature search was undertaken to determine whether evaluation
methodologies had been suggested or used which could be applied to
our study of the productivity of on-line terminals accessing the ERIC
data base through Lockheed's DIALOG. Relevant citations are given at
the end of this section.

A. IMPORTANT MONOGRAPHS

Anyone beginning an evaluation of an information retrieval system 
should be familiar with the following landmark monographs in the field. 

Lancaster's Information Retrieval Systems: Characteristics,
Testing and Evaluation (1968) is an outstanding book, which received
the American Society for Information Science's award for best information
sciences book of the year in 1970. It is concerned primarily with
intellectual factors that significantly affect the performance of
information retrieval systems (batch or on-line): indexing policy and
practice; vocabulary control; searching strategies; interaction between the
system and its users. Recall and precision are selected as the most
important measures of system performance.

Chapter 2 of King and Bryant's The Evaluation of Information Services
and Products discusses kinds of measures that can be used to evaluate
information retrieval systems, including recall, precision, fallout and
generality ratios, and total retrieval. Measures and techniques suitable for
"macroevaluation" (gross measures of input, output and effectiveness, made
at minimum cost, suitable for support of decisions affecting funding and
administration), and "microevaluation" (identification and diagnosis of
failing components of information systems) are presented in Chapters 3
and 4.

Lancaster's Information Retrieval On-Line (1973), particularly
Chapters 8 and 9, describes a number of systems currently in operation,
discusses evaluation methodologies, and summarizes several evaluative
studies done to date. Several chapters contain extensive bibliographies.
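
The ratio measures named in these monographs can all be derived from a two-by-two table of retrieved versus relevant documents. The Python sketch below uses the standard textbook definitions of recall, precision, fallout, and generality. The function name, argument names, and sample counts are illustrative assumptions and are not taken from Lancaster or from King and Bryant.

```python
def retrieval_measures(hits, noise, misses, rejected):
    """Compute ratio measures from a 2x2 retrieval table.

    hits     -- relevant documents that were retrieved
    noise    -- nonrelevant documents that were retrieved
    misses   -- relevant documents that were not retrieved
    rejected -- nonrelevant documents that were not retrieved
    """
    collection = hits + noise + misses + rejected
    return {
        # share of all relevant documents that the search found
        "recall": hits / (hits + misses),
        # share of the retrieved documents that are relevant
        "precision": hits / (hits + noise),
        # share of all nonrelevant documents that were retrieved anyway
        "fallout": noise / (noise + rejected),
        # share of the whole collection that is relevant to the question
        "generality": (hits + misses) / collection,
        # size of the retrieved set
        "total_retrieval": hits + noise,
    }


# Hypothetical counts for one search against a 1,000-document file.
measures = retrieval_measures(hits=30, noise=20, misses=10, rejected=940)
for name, value in measures.items():
    print(f"{name}: {value}")
```

The sketch assumes every denominator is non-zero; a search that retrieves nothing, or a question with no relevant documents in the file, would need separate handling.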




B. BIBLIOGRAPHIES AND REVIEWS

There is a great deal of literature about on-line information
retrieval systems. Much of it is devoted to descriptions of specific
systems and their features. No attempt has been made to include this
literature in the present compilation, since it is included in several
published bibliographies. There is also a quantity of literature on
the evaluation of information retrieval systems (not necessarily on-line
systems). Some general papers have been included in addition to the
evaluation studies listed in the next section.

In recent years, the user interface has been the subject of
considerable attention. John Bennett's chapter of the ASIS Annual Review
of Information Science and Technology for 1972 deals comprehensively
with "The User Interface in Interactive Systems." Thomas Martin's
chapter for the 1973 ASIS Annual Review, on the same subject, focuses
primarily on the conceptual aspects of interaction. The 1971 AFIPS
workshop proceedings, Interactive Bibliographic Search: The User/Computer
Interface, edited by Don Walker, contains an extensive classified
bibliography.

A bibliography on Evaluation of Document Retrieval Systems, covering
literature published up to early 1968, was published by Wiederkehr,
and Beth Krevitt published a bibliography on Evaluation of Information
Systems that covered the 1967-1972 literature.
C. PRIOR EVALUATION STUDIES

Wiederkehr (pg. 1^.) discussed the appropriateness of measures
of effectiveness and of efficiency as follows:

"An appropriate measure to be used as a criterion for evaluating
an information retrieval system should account for both how
effectively the objectives are being met as well as how efficiently
resources are being used. Consequently, it is desirable to have
measures of effectiveness, such as how many useful documents were
retrieved, and measures of efficiency, such as the cost and time.
Recall and precision only partly satisfy this desire.

In the research and development phase of any system, the
primary objective is to demonstrate the technical feasibility of
the system. Accordingly, effectiveness is of prime importance and
efficiency is often ignored. Once the technical feasibility of
the system has been proven, the objective shifts to demonstrating
the economic feasibility of the system. In most operating systems
economic feasibility is of prime importance, in which case both
the effectiveness and the efficiency should be taken into account.

Since most efforts to date concerning the evaluation of
information retrieval systems have treated systems in the research
and development phase, most of the measures considered have been
measures of effectiveness, such as recall and precision. However,
as the systems become operational on a large scale, measures of
efficiency and overall measures which account for both effectiveness
and efficiency are anticipated."

Our study was concerned primarily with measures of productivity and
efficiency.

A number of evaluations of on-line systems have been carried out.
Most of these focused on measures such as recall and precision; most
dealt with effectiveness of search formulation and with human factors
such as training required, attitudes, or frustration. Few have dealt
with efficiency or productivity as a primary factor: effectiveness has
been much more the focus.

Lancaster suggests that there are two basic ways of collecting
evaluative material, which may be used separately or in combination:
through a series of survey forms (e.g., pre- and post-retrieval
questionnaires), or through the terminal itself, using computer data
collection techniques. (Lancaster, 1973, p. 157.)

Table 2 summarizes the aspects covered, and the techniques used,
in prior evaluation studies of on-line bibliographic searching systems.
D. OTHER STUDIES

Several of the studies included in Table 2 devote considerable
attention to specific system features. Other papers that discuss system
design features in more general terms are those by Back (1972), Bennett
(1971), and Martin (1974). Martin's paper compares specific system features
of 11 on-line searching systems. The report is aimed at system designers,
and is not intended to be a system selection guide. This report is the
third in a sequence which began with the AFIPS workshop (Walker) and
continued with the AFIPS/ASIS workshop (Martin, JASIS 1973). Fife's review
paper discusses over 50 technical features of some 46 interactive
information systems. This paper is designed to help with state-of-the-art
assessments prior to system selection. System features are also discussed in
Interactive Bibliographic Systems, where papers and discussions from a
1971 conference are presented.

Cooper has proposed a measure of effectiveness involving a weighted
output and the number of citations the user desired. Tell and Williams
discuss the inverse weighting (value) of index terms according to their
frequency of use (e.g., specific terms with few postings are associated
with high weights; terms with many postings, with low weights).

A few articles dealing specifically with ERIC, though not necessarily
in an on-line context, have been included. Fry and Tell mention
the quality of RIE material; Jewell discusses search strategies
specifically for the ERIC data base.

Although Mittman, Treu, the SUPARS group, and others have kept 
logs of terminal activity, nothing was found in this literature search 
which was directly applicable to our primary purpose: to investigate 
the productivity, or factors affecting the productivity, of several 
different terminals accessing a central on-line data base. 

Annotated citations for the major papers identified in this liter- 
ature search are given in the next section. 
E. OTHER WORK IN PROGRESS

Several studies are currently in progress at other locations
that may provide additional findings that are relevant to this effort.
Dave Penniman of the Battelle Memorial Institute is presently completing
his dissertation at Ohio State University on a topic related to on-line
bibliographic searching systems. As part of that study he has performed
a detailed statistical study of the terminal transactions for Battelle's
BASIS system. The analysis included time distributions of the major
functions performed at the terminal (e.g., initialize, search index,
formulate logic, print) for each terminal and each of several major data
bases. This work was reported at the 1974 ASIS annual meeting but has
not been published yet.



The Biosciences Information Service (BIOSIS) has been running an
experimental on-line search service on the BIOSIS tapes, using the STAIRS
software in conjunction with 26 terminal installations of the SUNY
Biomedical Communications Network. As part of this experiment, a series of
20 test search statements was given to each installation for a controlled
test of searching performance. Because the same 20 questions were given
to each installation, it will be possible to analyze the different
approaches taken by 26 different analysts to code and run the same
information requests. This should provide useful indications of the
variability of analysts' approaches to the same problem. The experiment was
discussed briefly at the 1974 ASIS meeting by Kay Durkin and is discussed in
the conference proceedings. However, the full report of the research results
is not expected to be available until sometime in 1975.
Atherton, Pauline A., K. H. Cook and J. Katzer. Free Text Retrieval
Evaluation. Syracuse, N.Y.: Syracuse University, School of
Library Science, 1972.

An extensive series of SUPARS experiments was conducted in
1971-72. Reaction to the system was evaluated by 63 telephone
interviews administered to random samples of users and non-users,
a semantic differential (described in the 1972 Katzer article),
and an analysis of requests for help made through a telephone
aid service which was available to searchers at the terminal.

Back, Harry B. "What Information Dissemination Studies Imply Concerning
the Design of On-Line Reference Retrieval Systems." Journal of the
American Society for Information Science 23:3 (May-June 1972)
156-163.

For a computer-based system to be accepted and used, it must
be designed so that the effort required to obtain pertinent
references from the computer is not much greater than the effort required
using other methods.

The characteristics of informal methods of information gathering
suggest five ways for minimizing the human effort expended in
retrieving references from an on-line system.

1. Allow the user to shape the interaction to fit his needs.

2. Retrieve few irrelevant references.

3. Furnish references to the appropriate type of document
(e.g., theoretical discourse, description of an application,
review article, etc.).

4. Provide direction for further search.

5. Deliver screened and evaluated references.

Back, Harry B. and Richard L. Van Horn. "A System to Improve the
Availability and Usefulness of Management Science Knowledge," in Donald
E. Walker, ed., Interactive Bibliographic Search: The User/Computer
Interface. Montvale, N.J.: AFIPS Press, 1971, 19-43.

The user interface features of a prototype retrieval system
are described. A research experiment is suggested, results of which
could be used to successively modify the interface design. In this
research, users would be given standard retrieval tasks. User
"protocols" would be monitored. Users would be asked to "think aloud"
and would be monitored by tape recorder in addition to computer
terminal records. Both written and oral user actions in problem
solving would be analyzed.
Bennett, John L. "Interactive Bibliographic Search as a Challenge to
Interface Design," in Donald E. Walker, ed., Interactive Bibliographic
Search: The User/Computer Interface. Montvale, N.J.:
AFIPS Press, 1971, 1-18.

This paper was the challenge paper for the AFIPS Workshop
on "The User Interface for Interactive Search of Bibliographic
Data Bases."

Challenge points were:

(A) Characteristics of the searchers served by the facility.

(B) Conceptual framework presented to the searcher.

(C) Role of feedback to the searcher during search.

(D) Operational characteristics of the facility: the command
language, display of formats, response time.

(E) Constraints of the terminal and techniques to ameliorate
them.

(F) Effect of the bibliographic data base on the user interface
for search.

(G) Introducing the search facility to the user.

(H) Role of evaluation and feedback in the redesign cycle.



Bennett, John L. "The User Interface in Interactive Systems," in Carlos
A. Cuadra, ed., Annual Review of Information Science and Technology.
Vol. 7. Washington, D.C.: American Society for Information Science,
1972, 159-196.

A comprehensive review with 113 references.

Coles, Victor L. "Remote Evaluation of a Remote-Console Information-
Retrieval System (NASA/RECON)," in Interactive Bibliographic Systems.
Proceedings of a Forum Held at Gaithersburg, Maryland, October 4-5,
1971. Oak Ridge, Tenn.: U.S. Atomic Energy Commission, Office of
Information Services, April 1973, 133-142. Open discussion, 142-149.
CONF-711010.

Continuing evaluation is sought at the NASA Scientific and Technical
Information Office. NASA/RECON search results are sent to users
accompanied by an evaluation form. The search procedure features
delegated searching via written requests. Search analysts screen the
printed output before sending it to the user.

Cook, K. H., L. H. Trump, P. Atherton and J. Katzer. Large Scale Information
Processing Systems. Final Report. Syracuse, N.Y.: Syracuse University,
School of Library Science, 1971. 6 vols.

The full report on the SUPARS experiments.
Cooper, W. S. "Expected Search Length: A Single Measure of Retrieval
Effectiveness Based on the Weak Ordering Action of Retrieval
Systems," American Documentation 19 (January 1968) 30-41.

Given a set of retrieved documents ordered by expected
relevance, a measure of effectiveness is obtained relative to the
user's quantification of the number (n) of relevant documents
desired. The expected search length is defined to be the number
of nonrelevant documents preceding the nth relevant document.
This can be compared against an expected search length of a
hypothetical, randomly ordered system output. The fractional
reduction in expected search length in going from the random to
the actual system is called the mean expected search length factor.
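
The measure described in this annotation lends itself to a short worked computation. The Python sketch below assumes the system output is a weak ordering, given as a list of levels from best to worst, each described by its counts of relevant and nonrelevant documents. Within the level where the search ends, it uses the expression j + i*s/(r + 1), the form usually attributed to Cooper, where j is the number of nonrelevant documents in the levels read in full, i and r are the nonrelevant and relevant counts of the final level, and s is the number of relevant documents still needed from it. The function names and sample figures are hypothetical.

```python
from fractions import Fraction


def expected_search_length(levels, wanted):
    """Expected count of nonrelevant documents read before the
    `wanted`-th relevant one, for a weakly ordered output.

    levels -- list of (relevant, nonrelevant) counts, best level first
    wanted -- number of relevant documents the user desires (n >= 1)
    """
    if wanted < 1:
        raise ValueError("wanted must be at least 1")
    skipped = 0            # nonrelevant documents in levels read in full
    still_needed = wanted  # relevant documents not yet found
    for relevant, nonrelevant in levels:
        if relevant >= still_needed:
            # The search ends inside this level, whose documents are
            # assumed to be presented in random order.
            return skipped + Fraction(nonrelevant * still_needed, relevant + 1)
        skipped += nonrelevant
        still_needed -= relevant
    raise ValueError("output holds fewer relevant documents than wanted")


def reduction_factor(levels, wanted):
    """Fractional reduction in expected search length relative to a
    randomly ordered output containing the same documents."""
    total_relevant = sum(r for r, _ in levels)
    total_nonrelevant = sum(i for _, i in levels)
    # A random ordering is a single level holding every document.
    random_esl = expected_search_length(
        [(total_relevant, total_nonrelevant)], wanted
    )
    if random_esl == 0:
        raise ValueError("no nonrelevant documents, so nothing to reduce")
    actual_esl = expected_search_length(levels, wanted)
    return (random_esl - actual_esl) / random_esl


# Hypothetical output: three levels, and the user wants two relevant documents.
levels = [(1, 0), (1, 2), (2, 4)]
esl = expected_search_length(levels, 2)
factor = reduction_factor(levels, 2)
print(f"expected search length: {esl}, reduction factor: {factor}")
```

A strictly ranked output is the special case in which every level holds exactly one document, so the same function covers both the simplified definition quoted above and the weak-ordering case named in the title of Cooper's paper.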

Fife, Dennis W., and others. A Technical Index of Interactive Information
Systems. Final Report. Washington, D.C.: National Bureau of
Standards, March 1974. 79 p. FGK56375. ED-092 163.

The technical features and operational status of interactive
information systems, i.e., those providing a conversational usage
mode to a non-programmer through a data terminal device, are reviewed.
The review is designed to aid information specialists in the
state-of-the-art assessments preparatory to a detailed system selection
procedure. It contains an index: 46 systems are listed by trade
name. The index provides information about over 50 technical
features. Information is based primarily on documentation received
during 1972 and 1973. In addition, there are aids and examples
contributing to the intended use of the index.

Fry, Bernard M. Evaluation Study of ERIC Products and Services. Summary
Volume. Final Report. Bloomington, Ind.: Indiana University,
Graduate Library School, March 1972. 51 p. ED-060 922.

Although the scope of this evaluation specifically excludes
evaluation of the ERIC tape data bases, it is of interest for some
of the comments about the data in RIE. Data gathered from individual
users' responses, site interviews, and advisory panels suggested
the following changes or improvements in RIE should be studied
(partial list):

- merging institutional entries without regard to sub-divisions

- coding level (age, elementary, high school, etc.)

- coding type (speech, survey, report, etc.)

- omitting or flagging non-available documents

- indexing consistency as between general or specific

- correcting unevenness in quality of documents.

RIE was evaluated high on its range of topics, the contents of
resumes, and the indexing system, but relatively low in other
characteristics, including quality of material selected and timeliness.
Interactive Bibliographic Systems. Proceedings of a Forum Held at
Gaithersburg, Maryland, October 4-5, 1971. Oak Ridge, Tenn.: U.S. Atomic
Energy Commission, Office of Information Services, April 1973.
205 p. CONF-711010.

This volume contains papers and discussions in the areas of
user interface, system configuration, economics and performance, and
future developments. Many of the papers give detailed operating
experiences. The open discussions are particularly interesting; they
include timing and cost information. Van Wente quotes an average
response time for a RECON command as 40 seconds (average range
including BEGIN, SELECT, COMBINE and other commands). Average sit-down
time for a RECON user at Goddard Space Flight Center, NASA, was quoted
as one-half hour, with thorough and extensive searches such as those
needed before starting a new research project easily lasting an hour
(p. 16, 17). Coles reports slightly longer search times (45 minutes),
though overall times for delegated searching average 1-1/2 hours.

Jewell, Sharon and W. T. Brandhorst. Search Strategy Tutorial; Searcher's
Kit. Washington, D.C.: National Inst. of Education, October 1973.
86 p. ED-082 763.

From the ERIC Data Base Users Conference, Columbus, Ohio,
October 10-12, 1973. This document is the workshop manual used in
a three-hour tutorial session on search strategies. The discussion
of the input phase of a computer search covers identification of the
user population, receiving the inquiry, and the types of services
offered. General principles of good searching, search theory and
general manipulative capabilities are discussed, as well as specific
properties of the ERIC system that affect computer search
capabilities. A practice session is included. The output phase of a
computer search includes a discussion of output formats, output
evaluation, and statistical records-keeping.



Katter, Robert V. "Insights in Implementing the Redesign Cycle," in
Interactive Bibliographic Systems. Proceedings of a Forum Held at
Gaithersburg, Maryland, October 4-5, 1971. Oak Ridge, Tenn.: U.S.
Atomic Energy Commission, Office of Information Services, April 1973,
175-182. CONF-711010.

Four classes of feedback are discussed:

(1) System contact and use statistics.

(2) User commentaries.

(3) Output-efficiency evaluations.

(4) Interaction recordings.

"Sampling techniques to record terminal interactions and separate
off-line programs to analyze the data can be combined to provide
efficient recording and reduction of data."
Katzer, Jeffrey. "The Cost-Performance of an On-Line, Free Text
Bibliographic Retrieval System," Information Storage and Retrieval 9
(1973) 321-329.

"Performance measures, such as recall and precision, do not
supply any information about the operating efficiency of a system.
What is needed, and what has been suggested for some time now, is
performance characteristics paired with cost measures."

An estimate of recall was tabulated against costs at 10-90%
estimated recall levels. Results indicated that, for all but the
lowest recall levels, SUPARS I had better cost-performance operating
characteristics under the restrictions of simple logical operators
(using only OR and AND logic, versus using OR, AND, NOT logic
combined with word root searching). Another finding was that SUPARS
I was very expensive.

SUPARS II findings indicated that on-demand access to the
index or dictionary contributes significantly to improving the cost
performance.

Katzer, Jeffrey. "The Development of a Semantic Differential to Assess
Users' Attitudes Toward an On-Line Interactive Reference Retrieval
System," Journal of the American Society for Information Science
23:2 (March-April 1972) 122-128.

A user questionnaire employing nineteen 7-interval adjective
scales, such as fast-slow, active-passive, good-bad, is described.

The major finding of the study was that users reliably respond
toward such a system. Their affective responses can be conceptualized
into three independent components: (1) the evaluation of the system;
(2) the desirability of the system; and (3) the enormity of the system.



King, Donald W. and Edward C. Bryant. The Evaluation of Information Services 
and Products . Washington, D.C.: Information Resources Press, 1971. 
306 p. 

Chapter 2 contains a good discussion of performance measures 
based on user relevance judgments, including recall, precision, fall- 
out and generality ratios. Macroevaluation and microevaluation are 
presented in Chapters 3 and 4. 

King, Donald W., et al. Comparative Evaluation of the Retrieval Effective-
ness of Descriptor and Free-Text Search Systems Using CIRCOL (Central
Information Reference and Control On-Line). Rockville, Md.: Westat
Research, Inc., January 1972. RADC-TR-71-311. ED-O*-? 137. Available
through NTIS (AD-738 299).

This study compares the retrieval effectiveness of two alternative
input and search systems in terms of such measures as recall, fallout,
precision, and total retrieval. One system operates using manually
indexed document files searched by controlled vocabulary, while the
other employs full text input using natural language searching. The
results indicate that the two systems perform at approximately the same
level of effectiveness, although estimated average total retrieval was
found to be slightly greater for free-text searching than for descriptor
searching at all levels of recall.


Krevitt, Beth and Belver Griffith. "The Evaluation of Information
Systems: A Bibliography 1967-1972," in Information Pt. II, vol. 2,
no. 6. New York: Science Associates International, 1973. p. 1-34.

The scope of this classified bibliography is limited to the
design, testing and evaluation of information storage and retrieval
systems. Contains sections on evaluation techniques and on on-line
interactive systems.

Lancaster, F. Wilfrid. Evaluation of the MEDLARS Demand Search Service.
Washington, D.C.: National Library of Medicine, January 1968.
276 p. Available from NTIS (PB-178 660).

This report presents the results of a detailed analysis by the
National Library of Medicine of the performance of MEDLARS in
relation to 300 actual requests made to the system in 1966 and 1967.
Delegated searches (demand search bibliographies) were requested in
person, by mail directly, or by mail through a librarian or
information specialist. A MEDLARS search was performed, and in addition
to the printed output, photocopies of 25 to 30 retrieved articles
(selected by random sampling if total retrieval was larger than 30)
were sent to the user, who evaluated each article on a three point
scale (major, minor, no value), as well as on a fourth point ("glad
to learn of article's existence because of some other need or
project"). A recall base was obtained from known relevant articles
supplied by the user, supplemented by a manual literature search.
Recall, precision, and "novelty" ratios were obtained. (The novelty
ratio is based on the "other need" answer.) The system was shown to
be operating, on the average, at about 58% recall and 50% precision.
However, search results were widely scattered; some achieved high
recall and high precision; others achieved completely unsatisfactory
recall results. A detailed failure analysis was performed.


Lancaster, F. Wilfrid. Information Retrieval Systems: Characteristics,
Testing and Evaluation. New York: Wiley, 1968. 222 p.

This book received the American Society for Information Science's
award for best book of the year on information science, in 1970. It
is concerned primarily with "intellectual" factors that significantly
affect the performance of all information retrieval systems, namely:
indexing policy and practice; vocabulary control; searching strategies;
interaction between the system and its users.

Recall and precision are selected as the most important measures 
of system performance; indexing, search strategies and other factors 
influencing this performance are discussed, using detailed examples. 
Failure analysis plays an important part in the overall analysis. 



Lancaster, F. Wilfrid. "MEDLARS: Report on the Evaluation of its
Operating Efficiency," American Documentation 20 (1969) 119-143.

A comprehensive program to evaluate the performance of MEDLARS 
was conducted by the National Library of Medicine in 1966 and 1967. 
This report describes the methodology used and presents a summary 
of the principal results, conclusions, and recommendations. The
detailed report on this study is listed above (PB-178 660).

Lancaster, F. Wilfrid. Evaluation of On-Line Searching in MEDLARS (AIM-TWX)
by Biomedical Practitioners. Urbana, Ill.: Illinois University,
Graduate School of Library Science (Occasional Papers 101), February
1972. 21 p. ED-062 989.

The purpose of the investigation was to determine how effectively
biomedical practitioners, with a minimum of introduction to the
system, can conduct on-line searches to satisfy their own information
needs. Forty-eight searches were conducted by biomedical practitioners
on Abridged Index Medicus (AIM-TWX). Trained search analysts then
structured and conducted searches on the same subject. It is concluded
that many biomedical practitioners could exploit AIM-TWX profitably
with the minimum of introduction to the system and without the
necessity of using a trained MEDLARS analyst. Limitations of the ELHILL
search system (SDC's ORBIT as modified for National Library of Medicine
use) mentioned were: ELHILL should be less error-sensitive; more cross
references are needed in the vocabulary file. Potential improvements
suggested include term weighting; visual displays (CRT type);
clustering techniques whereby documents "like" a given document could be
found; and acceptance of approximate keywords.



Lancaster, F. Wilfrid and E. G. Fayen. Information Retrieval On-Line. Los
Angeles, Calif.: Melville, 1973. 597 p. (A Wiley-Becker & Hayes
Series Book.)

This book provides a broad survey of the characteristics, capa- 
bilities, and limitations of present on-line interactive systems for 
bibliographic search and retrieval. The emphasis is on the design, 
evaluation and use of such systems, primarily from the viewpoint of 
the planner and manager of information services. It is oriented 
toward the "intellectual" aspects of information retrieval rather 
than the hardware or programming aspects. Chapter 8: Evaluating
Effectiveness of the System, and Chapter 9: Operating Experience and
Evaluation Results, discuss evaluation methodologies and results from
the several on-line search systems, including AIM-TWX, DIALOG, SUPARS
and others.



Lancaster, F. W., Richard L. Rapport and J. Kiffin Penry. "Evaluating the
Effectiveness of an On-Line, Natural Language Retrieval System,"
Information Storage and Retrieval 8:5 (October 1972) 223-245.



An evaluation of the Epilepsy Abstracts Retrieval System 
(EARS) was performed. Searches were conducted on the on-line 
system and evaluated in terms of recall, precision, and general 
user satisfaction. Searchers (who were doctors, not search 
analysts) filled out forms before starting to search, including 
identification, time started, relevant abstracts retrieved (these 
were to be looked up by number in the hard copy of Epilepsy
Abstracts, located close to the terminal), and total elapsed time.
Parallel searches were conducted by experienced searchers on the
same topics.


Martin, Thomas H. A Feature Analysis of Interactive Retrieval Systems.
Stanford, Calif.: Stanford University, Institute for Communication
Research, September 1974. 100 p. SU-COMM-ICR-74-1. Available
from NTIS.


The command language features of eleven different on-line 
information retrieval systems are presented in terms of the func- 
tional needs of a searcher sitting at a terminal. Functional areas 
considered are: becoming familiar with the system, receiving help 
when in trouble, regulating usage, selecting a data base, formulating
simple queries, expressing single concepts, interconnecting
concepts, displaying results simply, and controlling the display.
Features felt most essential to on-line searching are live help,
users' guides, boolean operators, search field control, suffix 
removal, relational operators, dictionary access, request sets, 
search review, predefined formats, on-line formatting, and off-line 
printing. It is concluded that no sharp distinction exists between 
management information and bibliographic retrieval. The report is 
intended for use by designers of interactive retrieval systems and 
by students of system design. 

Martin, Thomas H. "The User Interface in Interactive Systems," in Carlos
A. Cuadra, ed., Annual Review of Information Science and Technology.
Vol. 8. Washington, D.C.: American Society for Information Science,
1973, 203-219.


This review focuses on the conceptual aspects of interaction.

48 references. 

Martin, Thomas H., James Carlisle and Siegfried Treu. "The User Interface
for Interactive Bibliographic Searching: An Analysis of the
Attitudes of Nineteen Information Scientists," Journal of the American
Society for Information Science 24:2 (March-April 1973) 142-147.

Results of a questionnaire administered to 19 information
scientists at an AFIPS/ASIS sponsored workshop on the user-computer
interface. This workshop was a sequel to the AFIPS workshop reported by
Walker. 147 propositions (software and hardware features or response
patterns) were presented and respondents were asked to rate them on
a five point scale from "too rigid" to "too flexible." Cost was not
to be considered. Consensus was reached on 70 items at the .025
probability level.



Meister, David and Dennis J. Sullivan. Evaluation of User Reactions to
a Prototype Information Retrieval System. Canoga Park, Calif.:
Bunker-Ramo Corp., October 1967. 62 p. (NASA-CR-918). ED-019 094.

This early evaluation of the experimental RECON retrieval system
as implemented by Bunker-Ramo Corp. was conducted using two separate
measures to determine acceptability and usability: (1) frequency of
system usage, and (2) personal opinion of the user population. A
second method of evaluation consisted of measuring the accuracy and
speed of RECON as compared with the major existing information
retrieval method, an off-line computer search formulated by librarians.

Users were satisfied that on-line searching was faster than
off-line batch searching or manual searching, but felt that RECON's response
time was very slow.



Melnyk, Vera. "Man-Machine Interface: Frustration," Journal of the American
Society for Information Science 23:6 (November-December 1972) 392-401. 

As an exploration of the frustration experienced by users of an 
on-line interactive retrieval system, students participated in an ex- 
periment using an experimental reference retrieval system for library 
literature. Subjects were monitored by observation through glass 
partitions, in addition to a questionnaire on their emotional state. 
The control group received instruction and a plan for searching; the
experimental group received a demonstration on one system, but had to
use two other, undemonstrated systems. The experimental group
experienced much more frustration.



Mittman, Benjamin and Wayne D. Dominick. "Developing Monitoring Techniques 
for an On-Line Information Retrieval System," Information Storage and 
Retrieval 9 (1973) 297-307. 

Northwestern University's RIQS (Remote Information Query System)
generates the RIQSLOG, which logs all user actions and system actions.
This log is processable by the RIQS programs. The monitor was designed
to provide data for analysis of system response, user errors, query
complexity, use of resources such as central processor, time required
for searching, etc. During winter quarter 1972, students were monitored
in some 130 on-line sessions, containing approximately 625 individual
queries against a data base which contained 157 records from articles
which had been published in the Communications and Journal of the
Association for Computing Machinery.

Average real time per session (probably analogous to DIALOG's
search) was 23.9 minutes, with a range from 0.5 to 11K3 minutes.
CPU time per query averaged 0.6 seconds, and ranged from 0.04 to
28.9 seconds. Real time per query (probably analogous to DIALOG's
question) broke down as follows:

- in 72 percent of the queries, real time for query input was
  less than 3 minutes
- in 92 percent of the queries, real time for query input was
  less than 6 minutes
- in 4 percent of the queries, real time for query input was
  greater than 8 minutes, with a maximum of 22 minutes.

A ratio was plotted: real time for query input over total real time
for input and execution of that query. The ratio lay between 0.8 and
1.0, indicating that a very substantial amount of the real time for
performing a search is attributable to entering the query into the
system.
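
The ratio described above is simple to compute for a single query. The sketch below assumes total real time is the sum of input time and execution time, as the annotation states; the function name and the sample timings are illustrative and are not taken from the RIQSLOG.

```python
def input_share(input_seconds: float, execution_seconds: float) -> float:
    """Fraction of a query's total real time spent entering the query."""
    total = input_seconds + execution_seconds
    if total <= 0:
        raise ValueError("total real time must be positive")
    return input_seconds / total


# Hypothetical query: two minutes of typing, fifteen seconds of execution.
share = input_share(120.0, 15.0)
```

For these invented timings the share is about 0.89, which falls inside the 0.8 to 1.0 band reported in the article.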

The initial attempt at relating query complexity to search time
was to use a simple count of search terms. Large numbers of search
terms associated with relatively little CPU time were observed for
queries which tended to use AND operators (causing the algorithm to
terminate execution of the Boolean combinations at the first false
comparison).
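
Why a long AND query can be cheap is easiest to see in a small sketch. The code below is not the RIQS algorithm, whose internals the annotation does not describe; it only illustrates the general idea of stopping an AND chain at the first false comparison. The function name and index terms are hypothetical.

```python
def and_query(terms, record_terms):
    """Test a record against an AND of terms, stopping at the first miss.

    Returns (matched, comparisons) so the work actually done is visible.
    """
    comparisons = 0
    for term in terms:
        comparisons += 1
        if term not in record_terms:
            # One false comparison decides the whole AND chain.
            return False, comparisons
    return True, comparisons


record = {"TERMINALS", "ON-LINE SYSTEMS"}
# Four terms in the query, but only one comparison is needed to reject the record.
result = and_query(["READING", "TERMINALS", "ON-LINE SYSTEMS", "COSTS"], record)
```

A count of search terms therefore overstates the work done for AND-heavy queries, which is consistent with the observation reported above.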

A performance equation was developed which involved CPU
predicted time, number of words generated to output reports, number
of records scanned, and a measure of statement complexity. The
measure of statement complexity "could not be obtained
deterministically," but was manually assigned from a visual scan of the text.
(The article does not elaborate further on the measure of statement
complexity.)

Rosenberg, Victor. "A Technique for Monitoring User Behavior at the
Computer Terminal Interface," Journal of the American Society for
Information Science 24:1 (January-February 1973) 71.

Description of a "two track" user behavior observation. Track
one is a printout of all communications between user and computer,
with a time log in the margin indicating elapsed time. Track two
is obtained with a tape recorder. The user is asked to "think aloud,"
to state the problem, the developing search strategy, evaluation of
system performance, etc. This is transcribed and time-keyed, as was
track one, and the two are compared. Rosenberg comments on the
difficulty of dealing with the resulting mass of data but suggests that
it can be worth it for valuable insights obtained.
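
Comparing the two time-keyed tracks amounts to interleaving them on elapsed time. The sketch below is one possible way to do that, not Rosenberg's procedure; it assumes each track is already a list of (elapsed_seconds, text) pairs sorted by time, and the sample entries are invented.

```python
import heapq


def merge_tracks(printout, transcript):
    """Interleave two time-sorted tracks into one chronological listing.

    Each input is a list of (elapsed_seconds, text) pairs. The result
    is a list of (elapsed_seconds, source, text) tuples.
    """
    terminal = ((t, "terminal", text) for t, text in printout)
    spoken = ((t, "spoken", text) for t, text in transcript)
    return list(heapq.merge(terminal, spoken))


# Hypothetical session fragments.
track_one = [(0, "BEGIN"), (45, "SELECT READING")]
track_two = [(10, "I want studies on reading"), (60, "That is too many hits")]
session = merge_tracks(track_one, track_two)
```

Reading the merged listing in order shows what the user said just before and just after each terminal action.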



Summit, Roger K. "DIALOG and the User — an Evaluation of the User Inter- 
face with a Major On-Line Retrieval System," in Donald E. Walker, 
ed., Interactive Bibliographic Search: The User-Computer Interface.
Montvale, N.J.: AFIPS Press, 1971, 83-94.

A description of the DIALOG retrieval system is given. Ease
of use is stressed. Additional details on the evaluations are
described in Timbie.

Summit, Roger K. Remote Information Retrieval Facility. Palo Alto, Calif.:
Lockheed Missiles and Space Co., April 1969. p. (NASA-CR-1318).

Describes the use of DIALOG by NASA. It was found that
end-users tended to use more complex logic and more terms than
intermediary searchers. Complexity was indicated by the number of logical
AND connectors used by searches.


Tell, Bjorn V., and others. The Use of ERIC Tapes in Scandinavia:
Searching with Thesaurus Terms in Natural Language. Strasbourg, France:
Council of Europe; and Stockholm, Sweden: Council for Cultural
Co-operation; Royal Institute of Technology, 11 November 1972. 23 p.
(ECS-DCC-72-15). ED-072 794.

This is a description of the batch processing SDI system used
in Sweden, using the ABACUS and VIRA programs to search the ERIC
files. The high noise level of the ERIC data base is mentioned; in
one case (a search on audiovisual aids for the mentally retarded)
it was found to be about 40%. This is considered quite high,
considering that ERIC is a central data base for this sort of request.

A weighting procedure is suggested, based on term-usage
frequency. Tell suggests that high frequency terms are looked upon
as having less value than those with low frequencies. (Williams
mentions this also.) A value assignment of 1/n, where n is the
number of term postings over a large sample from each data base,
say 30,000 references, would allow output printout to be ordered
according to the sum of specificity of retrieval terms.

(Note: Simply because a term has low "information value"
does not mean that it is useless in a search. It may provide a
background against which other terms must be searched.)
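
The 1/n weighting lends itself to a short sketch. The code below assumes that each retrieved reference is scored by summing 1/n over the search terms it matched and that output is ordered from highest to lowest score; the function name, term labels, and posting counts are hypothetical and do not come from the ERIC file.

```python
def rank_by_specificity(matches, postings):
    """Order reference ids by the summed 1/n weight of their matched terms.

    matches:  dict mapping reference id -> set of matched search terms
    postings: dict mapping term -> number of postings n in the sample
    """
    def score(terms):
        return sum(1.0 / postings[term] for term in terms)

    return sorted(matches, key=lambda ref: score(matches[ref]), reverse=True)


# Invented posting counts over a sample of references.
postings = {"EDUCATION": 15000, "AUDIOVISUAL AIDS": 1500, "MENTALLY HANDICAPPED": 300}
matches = {
    "A": {"EDUCATION"},
    "B": {"AUDIOVISUAL AIDS", "MENTALLY HANDICAPPED"},
    "C": {"MENTALLY HANDICAPPED"},
}
ordering = rank_by_specificity(matches, postings)
```

Here reference B, which matches two fairly specific terms, is listed first, and reference A, which matches only the heavily posted term, is listed last. As the note above cautions, a low weight affects ordering only; the broad term can still be needed to define the search.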

Timbie, Michele and Don Coombs. An Interactive Information Retrieval
System: Case Studies on the Use of DIALOG to Search the ERIC
Document File. Stanford, Calif.: Stanford University, ERIC
Clearinghouse of Educational Media and Technology, December 1969.
90 p. ED-034 431.

User studies. A synopsis of this study is reported in Lancaster
(Information Retrieval On-Line, 1973).



Treu, Siegfried. A Computer Terminal Network for Transparent Stimulation
of the User of an On-Line Retrieval System. Washington, D.C.:
National Bureau of Standards, Center for Computer Sciences and
Technology, July 1972. 39 p. (NBS-TN-732). ED-070 461.

A computer terminal network to enable "transparent stimulation"
of the user of an on-line retrieval system has been designed,
implemented, and pilot tested. Its basic purpose is to provide a suitable
and effective framework and methodology for experimental
identification/validation of those human characteristics which should be
recognized/reinforced in man-computer interface design. The rationale
behind the transparent stimulation approach is presented and the
methodology employed for such real-time, unobtrusive scanning and
manipulation of the man-computer dialogue is described. A general
overview of the hardware and software features of the implemented
stimulation network is included.

Treu, Siegfried. "A Conceptual Framework for the Searcher-System Inter- 
face," in Donald E. Walker, ed., Interactive Bibliographic Search: 
The User-Computer Interface. Montvale, N.J.: AFIPS Press, 1971,
53-66. 

This article discusses the terminal dialogue monitoring
capability being developed at the National Bureau of Standards.
Transparent stimulation (whereby a person at a remote location causes
the computer to prompt attitude-related questions) is also discussed.



Treu, Siegfried. "Techniques and Tools for Improving the Interactive 

System Interface," in Interactive Bibliographic Systems . Proceedings 
of a Forum Held at Gaithersburg, Md., October 4-5, 1971. Oak Ridge, 
Tenn.: U.S. Atomic Energy Commission, Office of Information Services, 
April 1973, 32-38. 

Two specific data collection tools have been developed at the
Center for Computer Sciences and Technology of the National Bureau
of Standards. The dialogue monitor records the time of sending and
receipt of each command in both directions. Thus data on the system
response time after the user hits the carriage return key, the time
the system takes to transmit an entire message, or the time from
receipt of the system message to the next user input (i.e., user think
time and keying time), may be obtained. Transparent stimulation is
recommended as an unobtrusive monitoring technique, and is described
as follows:

"The software is loaded by the (remote) observer, who has both
a teletypewriter and a specially constructed emotion-reason indicator
device available to him. Whenever he wishes (or the software demands)
during the course of a user-system dialogue, the observer can request
that the user indicate his current level of satisfaction (e.g., whether
annoyed, frustrated, happy) as well as the reason therefore. The
specially designed and constructed terminal that is available to the
user enables him to push the appropriate labeled buttons after being
prompted by a light and buzzer. The observer has an exact copy of
that terminal except that it has lights where the user terminal has
buttons and a prompt button in the place of a light."

The messages created in the transparent stimulation mode are
recorded by the dialogue monitor, along with their times.
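
The three intervals the dialogue monitor makes available can be derived from three timestamps per exchange. The sketch below is an assumption about how such a log might be reduced, not a description of the NBS software; the record layout, field names, and sample times are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Exchange:
    """Timestamps (in seconds) for one user command and the system's reply."""
    user_sent: float       # user hits the carriage return key
    reply_started: float   # first character of the system message arrives
    reply_finished: float  # last character of the system message arrives


def exchange_timings(exchanges):
    """Derive response, transmission and think-plus-keying time per exchange."""
    rows = []
    for i, ex in enumerate(exchanges):
        following = exchanges[i + 1].user_sent if i + 1 < len(exchanges) else None
        rows.append({
            "response": ex.reply_started - ex.user_sent,
            "transmit": ex.reply_finished - ex.reply_started,
            # Undefined for the last exchange: there is no next user input.
            "think_and_key": None if following is None else following - ex.reply_finished,
        })
    return rows


log = [Exchange(0.0, 2.5, 6.0), Exchange(20.0, 21.0, 30.0)]
timings = exchange_timings(log)
```

In this invented log the first reply begins 2.5 seconds after the carriage return and takes 3.5 seconds to transmit, and the user then spends 14 seconds thinking and keying before the next command.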

Walker, Donald E., ed. Interactive Bibliographic Search: The User-Computer
Interface. Proceedings of a Workshop held in Palo Alto, California,
on 14-15 January 1971. Montvale, N.J.: AFIPS Press, 1971. 404 p.

This workshop was devoted to problems and prospects for more 
effective systems design of the user interface. A challenge paper 
and several papers prepared in response are included. The discussions 
are very interesting. Operating experiences from a variety of on-line
interactive search systems are discussed, with special regard to the 
user-computer interface from the user's perspective. 

Wiederkehr, R. R. V. "Part I: The Literature Perspective," in Evaluation 
of Document Retrieval Systems; Literature Perspective, Measurement, 
Technical Reports. Bethesda, Md.: Westat Research, Inc., December
1968, 1-15. (PB-182 710).

This is a good survey of the literature of the 1960's on the
evaluation of information retrieval systems. This report was one of 
two volumes comprising the first draft of the 1971 King monograph 
described earlier. 

Williams, J. H. , Jr. "Functions of a Man-Machine Interactive Information 
Retrieval System," Journal of the American Society for Information 
Science 22:5 (September-October 1971) 311-317.

Describes the BROWSER system of IBM Federal Systems Division. 
The BROWSER system uses a free-form query and produces weighted 
output. Specific terms (with few postings) are associated with 
higher values; terms with many postings have lower values. 




IV. BACKGROUND INFORMATION 



A. THE ERIC DATA BASE 

In the mid-1960's the U.S. Office of Education established the
Educational Resources Information Center (ERIC) to provide access to
literature in the field of education. Through the long-term support of
the Office of Education, and currently the National Institute of Education,
ERIC has grown to become one of the leading social science information
resources in existence today.

To acquire and select material for inclusion in the ERIC data base,
a network of clearinghouses was established (presently 16 clearinghouses),
each with special expertise in a particular area of education. The
clearinghouses compile bibliographic information about each publication selected,
index each publication using a controlled vocabulary of Descriptors, assign
Identifiers, and in some cases write an abstract or brief annotation for
the publication. The records thus prepared by each clearinghouse are then
sent to a central processing center for further processing.


The two basic printed products of ERIC are the Research in Education
(RIE) journal and the Current Index to Journals in Education (CIJE). Both
are published monthly. Concurrently, machine readable versions of the RIE
and CIJE files are produced on magnetic tape. These tapes are available
at nominal cost on a monthly subscription basis to organizations that wish
to search the ERIC files by computer. There are presently about 100 ERICTAPE
subscribers. Two other files of educational material are also available
on magnetic tape. These files, dealing with topics of vocational and
technical education and produced and distributed by the ERIC Clearinghouse
on Vocational and Technical Education, are Abstracts of Instructional
Materials (AIM) and Abstracts of Research Materials (ARM).



In addition to the RIE and CIJE publications, and the corresponding
files on magnetic tape, a variety of printed listings and indexes are
published to aid the searcher. Also, the ERIC Document Reproduction
Service offers both microfiche and printed copies of all non-copyrighted
reports announced in RIE.

A more complete description of the ERIC system, including its scope
of coverage, products, services, and operational components, may be obtained
from the reports of the U.S. National Institute of Education.

B. SEARCHING THE ERIC DATA BASE BY COMPUTER

With the creation and monthly updating of machine readable files
and their ready availability, much searching of the ERIC document
collection is now being done using computers, both in batch mode and on-line.
(Batch mode is the procedure of submitting one or more independent search
requests to be processed by the computer with no interaction between
machine and searcher. Typically, in batch mode several days elapse between
submission of the search requests and receipt of the computer output. If
modification of a batch request is indicated, a similarly long interval is
required for receipt of the new results. On-line operation implies an
interaction between computer and searcher during the search process that
allows immediate feedback of results and immediate modification of the
request when desired.) On-line searching of the ERIC data base is made
possible through the facilities of central processing centers which provide
the required hardware and software for the searcher as well as maintaining
the ERIC bibliographic files in direct access oriented machine-readable
form.



A survey of the use of ERIC tapes for computer searching is given
in a recent report by Embry.3 A more detailed review of several installations
that search ERIC tapes in a batch mode is given in a recent report by Humphrey.

C. LOCKHEED DIALOG SYSTEM



The Lockheed Palo Alto Research Laboratory has, for a number of
years, operated an on-line search facility providing access to a variety
of bibliographic data bases. Since 1969 the ERIC data base has been
included in the files maintained by Lockheed.

As shown in Figure 2, a subscriber to the Lockheed on-line
retrieval service has local terminal equipment (remote from the computer),
the required telephone communication link, and on-line access to the
Lockheed Palo Alto computer and ERIC data base during a service period
of approximately 12 hours each working day.

The subscriber's terminal equipment is typically: 1) a cathode
ray tube (CRT) video display with keyboard; or 2) a hard copy printing
terminal (mechanical, thermal, etc.) with keyboard; or 3) a CRT terminal
with an auxiliary printer to print out, at the user's option, selected
portions of the transmissions.

The terminal equipment and communications channels can both be
obtained to handle a wide range of transmission rates. Most of the
generally available terminal equipment (mechanical or video) operates at
speeds of 10, 15, or 30 characters per second; many models are available
that will operate at 120 characters per second, or even 480 characters
per second.

Even though some units of the terminal equipment might be able to
handle a high data rate, the actual data rate will be limited by the
capacity of the communication channel (i.e., the phone line) and the
interface to the computer equipment. Most on-line terminals now operate in a
dial-up mode in which the subscriber uses an ordinary telephone handset
to dial up the number of the computer, and then puts the handset into
an acoustic coupler to connect the signal to the terminal equipment.
The dial-up arrangement results in the use of whatever telephone line
and circuit is selected by the telephone switching equipment; that is,
a normal voice grade line, sometimes good but sometimes not too good.
The data rate achievable over such ordinary dial-up lines is theoretically
about 1900 characters per second; however, in practice most dial-up
terminals operate at a lower rate (seldom more than 120 characters per
second) because of considerations of the quality of the data transmission
(i.e., problems with available MODEMS, transmission noise, and error rates
for higher speeds).
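
The point that the slowest link sets the pace can be made concrete with a small calculation. The sketch below assumes the effective rate is simply the minimum of the terminal, line, and computer port rates and ignores protocol overhead and retransmissions; the function names and the 3,000-character output size are illustrative assumptions.

```python
def effective_rate(terminal_cps: float, line_cps: float, port_cps: float) -> float:
    """Characters per second actually achievable: the slowest component wins."""
    return min(terminal_cps, line_cps, port_cps)


def transmit_seconds(characters: int, rate_cps: float) -> float:
    """Time to deliver a given amount of output at a steady rate."""
    return characters / rate_cps


# A 120 char/sec terminal on a dial-up connection limited to 30 char/sec,
# calling a 480 char/sec computer port.
rate = effective_rate(120, 30, 480)
slow = transmit_seconds(3000, rate)   # 3,000 characters of search output
fast = transmit_seconds(3000, 480)    # the same output at 480 char/sec
```

Under these assumptions the dial-up connection needs 100 seconds for output that a 480 character per second connection would deliver in 6.25 seconds.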

Most dial-up users not in the immediate vicinity of Palo Alto
currently use Tymshare's TYMNET network to access the DIALOG system. This
network provides local phone numbers in over 50 major cities at an hourly
connect cost of $10.00. This compares favorably with direct distance
dialing, which may cost as much as $30.00 per connect hour. TYMNET currently
supports terminals in the 10 to 30 characters per second speed range.
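
The comparison of connect costs is straightforward arithmetic. The sketch below uses the two hourly figures quoted above; the 20 connect hours per month is an assumed usage level chosen only for illustration, and the direct-dial figure is the upper bound stated in the text rather than a typical rate.

```python
TYMNET_HOURLY = 10.00           # dollars per connect hour, from the text
DIRECT_DIAL_HOURLY_MAX = 30.00  # upper figure quoted for direct distance dialing


def connect_cost(hours: float, hourly_rate: float) -> float:
    """Communication cost for a given number of connect hours."""
    return hours * hourly_rate


hours_per_month = 20  # assumed usage, not a figure from the study
tymnet = connect_cost(hours_per_month, TYMNET_HOURLY)
direct = connect_cost(hours_per_month, DIRECT_DIAL_HOURLY_MAX)
max_saving = direct - tymnet
```

At the assumed 20 hours per month, TYMNET access would cost $200 against as much as $600 for direct dialing.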

Instead of using a dial-up line with its fluctuating quality control
problems, some users prefer to lease a line for exclusive use as the
connecting link between the terminal and the computer. A leased line, because
it is both clearly identified and dedicated to this single application,
can be inspected, modified or specially conditioned, and maintained by
the telephone company to provide potentially the best performance possible
for a telephone line. Such leased lines often operate reliably at speeds
up to 480 characters per second. Because the leased line access is
provided at a fixed monthly cost with unlimited access (up to 12 hours per
day currently), this method has an economic as well as operational
advantage for the high level user. Furthermore, because the leased line is always
connected to the computer, there is never any difficulty with a busy
signal because all of the available lines are being used (i.e., you are
never denied a port).

The data that is transmitted over the telephone lines must go
through a transformation process between the terminal and the phone lines:
digital signals generated at the terminal must be transformed into
equivalent audio analog signals for transmission over the phone lines, and vice
versa for transmission to the terminal. This requires some type of
MOdulator-DEModulator (MODEM) equipment. Such equipment can be units
separate from the terminal equipment (e.g., Bell Telephone Co. Datasets)
that take an electrical analog signal from the phone line and directly
convert it into an equivalent electrical signal in digital form, or vice
versa. However, equipment is also available (acoustic couplers) that will
take the telephone audio signal as heard through the handset, and
acoustically transform that signal back to an equivalent electrical signal, and
vice versa. Using this relatively inexpensive equipment eliminates the
need to rent the generally more expensive MODEM equipment. The acoustic
couplers are normally purchased as separate units of equipment, but are
sometimes built directly into the terminal equipment. The acoustic
couplers work quite well at speeds of 30 characters per second, and there
are even some units that can operate at speeds up to 120 characters per
second.

The computer equipment interface to the telephone lines may also
have some restrictions on data transmission rates. The Lockheed computer
that was in use at the time of this study had input ports that accommodated
transmission rates of 10, 15, 30, 120, and 480 characters per second.

The (i^int of this discussion is to note that there is a wide 
variety of terminal facilities available, with greatly different charac- 
teristics and data transmission rates. The high speed equipment is more 
expensive, but can process a search faster than the low speed equipment, 
and may be more cost effective at some moderate level of search activity. 
All of the installations experienced in this study had somewhat different 
terminal equipment, and it was expected that this would be related to their 
search capacity and productivity. A summary of the terminal equipment used 
at each of the installations studied is given in Table 3. 

The Lockheed hardware facilities in Palo Alto consist basically of 
an IBM 360/50 computer (to be upgraded to an IBM 360/65 in December 1974)
with both disc and data cell auxiliary storage with capacity for storing
over 5 billion characters of data bases, plus communication equipment to 
accommodate a large number of remote terminals.

TABLE 3

TERMINAL EQUIPMENT USED BY THE INSTALLATIONS STUDIED

                CRT          Hard Copy Terminal        Communication                    Data Transmission
                Terminal     Equipment (all operating  Equipment                        Rate (characters
Installation    Equipment    at 30 char/sec)           and Lines                        per second)

2               CC-30        GE Terminet 300           Modem, leased line               480
3a*             CC-30        GE Terminet 300           Modem, leased line               240
3                            GE Terminet 300           Acoustic coupler, dial-up line    30
4               CC-30        GE Terminet 300           Modem, leased line               480
5               CC-30        GE Terminet 300           Modem, leased line               480
7               CC-30        GE Terminet 300           Modem, leased line               480
9               CC-30        GE Terminet 300           Modem, leased line               480
88                           GE Terminet 300           Acoustic coupler, dial-up line    30
114                          GE Terminet 300           Acoustic coupler, dial-up line    30
125                          GE Terminet 300           Acoustic coupler, direct line     30

*The 3a installation operated until July, 1973 and was then transferred to
another organization and identified as installation 3. Installation 3 is the
one that is analyzed in this report, but 3a data is included in some tables
for additional background information.



The subscriber communicates with the Palo Alto laboratory via the
DIALOG interactive command language. Messages or commands are entered
by the searcher on the terminal's keyboard, and output from Palo Alto is
displayed on the CRT screen and/or printed in hard copy at the searcher's
terminal depending upon the type of terminal used and the output option
selected by the user. The ERIC/DIALOG interactive language consists of
approximately 13 basic commands that allow the user to define sets of
documents indexed with specified terms or identifiers, to combine defined
sets with complete logical flexibility, to browse the ERIC thesaurus,
and to select from a variety of output options. It is not the intention
of this report to provide a description of the ERIC/DIALOG language.
For this the reader is referred to the DIALOG Terminal Users Reference
Manual, and several other publications that describe the DIALOG system.
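
The idea of defining sets of documents and then combining them with AND, OR and NOT logic can be sketched with ordinary set operations. This is an illustration only and is not DIALOG syntax: the postings table, the accession numbers, and the helper name select are all hypothetical.

```python
# Hypothetical postings: descriptor -> set of document accession numbers.
postings = {
    "ADULT EDUCATION": {1, 2, 3, 5},
    "ADULT EDUCATION PROGRAMS": {3, 4},
    "FEDERAL AID": {2, 4, 6},
    "EVALUATION": {4, 7},
}


def select(term):
    """Return the set of documents indexed with a term (empty if unknown)."""
    return set(postings.get(term, set()))


# OR logic: documents indexed with either descriptor.
adult = select("ADULT EDUCATION") | select("ADULT EDUCATION PROGRAMS")
# AND logic: documents in both sets.
funded = adult & select("FEDERAL AID")
# NOT logic: remove documents that are also indexed under EVALUATION.
not_evaluated = funded - select("EVALUATION")

print(sorted(adult), sorted(funded), sorted(not_evaluated))
```

Each intermediate result is itself a set, which is what allows previously defined sets to be reused in later combinations.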






D. REFERENCES



1. ERIC: A Profile. Washington, D.C.: National Institute of Education.
Educational Resources Information Center. Undated.

2. ERIC Processing Manual: Rules and Guidelines for the Acquisition,
Selection, and Technical Processing of Documents and Journal Articles
by the Various Components of the ERIC Network. National Institute of
Education. Educational Resources Information Center. July 1974.
544 p. ED-092 164.

3. Embry, Jonathan D., Wesley T. Brandhorst, and Harvey Marron. Survey
of ERIC Data Base Search Services. Washington, D.C.: National Institute
of Education. Educational Resources Information Center. July 1974.
29 p.

4. Humphrey, Allan. Survey of Selected Installations Actively Searching
the ERIC Magnetic Tape Data Base in Batch Mode. Vol. I. Berkeley,
Calif.: Univ. of California. Inst. of Library Research. June 1973.
86 p. ILR-74-003.

5. DIALOG Terminal Users Reference Manual. Palo Alto, Calif.: Lockheed
Palo Alto Research Laboratory. Lockheed Retrieval Service Information
Systems Laboratory, n.d., var. paging.

6. Summit, Roger K. "DIALOG and the User — an Evaluation of the User 
Interface with a Major On-Line Retrieval System," in Donald E. Walker, 
ed., Interactive Bibliographic Search: the User-Computer Interface.
Montvale, N.J.: AFIPS Press, 1971, 83-94. 

7. Summit, Roger K. ERIC On-Line Retrieval System. Use of the DIALOG
On-Line Information Retrieval System with ERIC Research in Education
Files. Final Report. Palo Alto, Calif.: Lockheed Aircraft Corp.
April 1970. 58 p. ED-040 592.






V. ANALYSIS OF TERMINAL INSTALLATIONS 


A. INSTALLATIONS STUDIED 

The number of installations that search the ERIC data base on- 
line using DIALOG varies from month to month and from hour to hour. As 
described later in this report, an important element of our investigation 
was a detailed examination of all ERIC/DIALOG search activity carried out
by nine terminal installations during 15 selected days during October and
November 1973. There were a few other organizations that were conducting 
ERIC searches using DIALOG during this same time period. However, these 
were not included in the detailed examination because they were not among 
those suggested by OE for study. 

Data was collected for each of these installations by site visits, 
telephone discussions, and analysis of Lockheed computer records.

B. LOCAL CHARACTERISTICS



Five of the installations included in this report were visited
in late 1973 by one of the staff members of this project. The site
visits were made for the purpose of obtaining an appreciation of the
ways the various centers operated, and also in order to note any
distinguishing features which might influence terminal productivity.
Several local factors could conceivably affect productivity such as 
the number of staff members trained in terminal searching, subject 
training of the searchers, previous experience of the searchers, the 
length of time that the center had been in operation, and a cost- 
recovery pricing policy for the users. These and other questions were 
considered for each of the sites visited. 

A brief description of the setting and characteristics of each of 
these sites is given below. Some of this data is summarized in Table 4. 
The only characteristics that were common to all of the installations
visited were: 1) all of the searching staff were professionals; 2) all
of these centers had CRT terminal equipment.

When considering the information in this table and in this section
it should be kept in mind that the data reflect the situation as of late
1973, and may not necessarily describe those centers at the present time.

1. Terminal 2 

This installation employs three full-time searchers. This center 
has been doing on-line searches since 197d. Currently their clientele 
consists primarily of employees of the U.S. Department of Health, Educa- 
tion and Welfare and also some people from other federal agencies. They 
use the SEARCH SAVE feature extensively and have been building their own 
SEARCH SAVE index since June 1973. 

2. Terminal 4 

This center has one full-time searcher, and is unique as a 
searching facility in a number of ways. It is an ERIC clearinghouse 
and has its own in-house abstractors, and searches are run primarily 
using their own Thesaurus. Because the abstractors are nearby, they 
can be consulted by the searcher to help resolve problems regarding the 
way concepts might actually be indexed. Most queries can be handled 
with two concepts and only one or two terms per concept. A search 
generally produces 30-60 citations. A duplicate copy of each search is 
maintained so that if the same query comes in before the next file
update, the duplicate can be sent out without having to run the search
again.

3. Terminal 5 

Terminal 5 has 11 part-time searchers, eight of whom have had
considerable experience as school teachers. The staff members are
generally quite familiar with ERIC. Their policy is complete and thorough
service, including use of manual searching to augment DIALOG when
appropriate, and sending hard copy or microfiche reproductions of cited
material upon request. DIALOG output is thoroughly reviewed for relevance;
even the original documents, as well as ERIC abstracts, are considered in
making relevance judgments. The order of priorities in reducing the size of
bibliographies which are too large (100-150 being the usual limit) is:
1) limit search to major terms only, 2) limit to CIJE references only,
3) limit by date of document acquisition by ERIC. This installation
sometimes uses intermediaries in the field to relay queries.
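
The order of priorities described for reducing an oversized bibliography can be sketched as a sequence of filters applied one at a time until the list fits. This is an illustration under stated assumptions, not the installation's actual procedure: the Citation fields, the function name reduce_bibliography, and the cutoff year are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Citation:
    ident: str
    major: bool   # retrieved on a major descriptor (assumed flag)
    cije: bool    # journal reference from CIJE (assumed flag)
    year: int     # year of acquisition by ERIC (assumed field)


def reduce_bibliography(cites, limit=150, min_year=1970):
    """Apply the three limits in priority order, stopping once the list fits."""
    steps = [
        lambda c: c.major,             # 1) major terms only
        lambda c: c.cije,              # 2) CIJE references only
        lambda c: c.year >= min_year,  # 3) limit by acquisition date
    ]
    for keep in steps:
        if len(cites) <= limit:
            break
        cites = [c for c in cites if keep(c)]
    return cites


sample = [
    Citation("A", True, True, 1972),
    Citation("B", True, False, 1968),
    Citation("C", False, True, 1973),
    Citation("D", True, True, 1966),
]
reduced = reduce_bibliography(sample, limit=2)
print([c.ident for c in reduced])
```

In this small example the first two limits are enough, so the date limit is never applied.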

4. Terminal 7 



This center employs 14 searchers (some full-time, some part-time), most
of whom have an advanced degree in education or another subject area. This
center automatically sends its clients hard copy or microfiche reproductions
of six to ten items cited in the DIALOG bibliography. Searchers improve their
on-line searching speed by making frequent use of EXPAND-SELECT combinations 
using the chaining method of entering commands (described in a later section), 
and by doing most of their document screening activities off-line (by using 
the REMKARD microfiche storage device). Since they are searching for a few 
"most relevant" documents they search primarily descriptors, use the LIMIT/ 
MAJOR feature extensively, and generally aim for high precision and low 
recall. This factor might be expected to result in less on-line search time 
spent per question. When appropriate, DIALOG searches are augmented by a 
manual search in their library of current publications. This center makes 
extensive use of intermediaries in the field to report queries in natural 
language.

5. Terminal 9 

Terminal 9 is the newest of the centers visited by the ILR staff, 
having begun on-line searching just three months prior to the site visit. 
This factor might be expected to affect the center's productivity for a
time. This installation employs two full-time searchers. All searches 
are done using DIALOG, but a DIALOG search may be manually augmented if an 
in-depth search is requested. This center's services are free to clients 
in its local area but those outside the area are charged $15 for a regular 
search (DIALOG bibliography only) and $25 for an in-depth search. When 
DIALOG sets are too large, rather than limiting searches to a definite 
maximum output, clients are telephoned and search questions re-negotiated.

VI. ANALYSIS OF ERIC/DIALOG USE

A. INTRODUCTION

Several steps were taken to explore the basic question of this
study: "Why are there wide-spread variations in questions processed per
hour across installations?" The project pursued the following major
sequence of supporting studies:

1. An investigation of ERIC/DIALOG system response time;

2. A detailed examination of searching patterns of nine
installations as provided by a special computer log ("trace
histories") of individual DIALOG commands executed by these
nine terminals during search operations;

3. A classification of questions processed by the nine
installations, according to complexity;

4. A review of the operating policies and procedures of the major
users of ERIC/DIALOG during the time period investigated;

5. Analysis of the data obtained.

To aid our investigation, an ERIC/DIALOG terminal was installed at
ILR from August, 1973 to March, 1974. The project staff used the terminal
extensively during this period and on the basis of this experience, and
discussions with searchers from other installations, formulated some general
search guidelines which are reported in a later section of this report.

B. STUDY OF ERIC/DIALOG SYSTEM RESPONSE TIME

1. Introduction

The first step carried out in this investigation was a detailed
study of the response time of the on-line computer system. This was
done to explore the hypothesis that the average response time and
resulting search rate experienced by a given terminal might be affected
by peak loading of the computer system. That is, terminal use at the
busiest days or hours of computer use might experience a slower search
rate. Consequently, any terminal installation that, because of local
scheduling or East-West Coast hours of operation, tended to come on at
the peak hours might experience a systematic lowering of its average
search rate. For example, if the system response were significantly
faster between the hours of 5:00-7:00 AM then the users on the East
Coast might be expected to have shorter elapsed times on the average
than those on the West Coast who come on the system at 6:00 AM. (All
times used in this report are PST, local California times.) In this
experiment, therefore, an attempt was made to measure the system
(equipment, communications, programs) response time only; other time which
might normally be spent at the terminal, such as time spent thinking
about what command to enter next, or reading items on the screen, was
reduced to a minimum.

This data collection task aimed to find out whether the system
response time (day of week or time of day) could possibly be a factor
in the differing average search rates experienced for each of the
terminal installations.

We were also interested in determining the difference in search 
speeds that might be due strictly to the typing speeds and other mechanical 
skills of the terminal operator. 

2. Methodology 

Following a review of the command histories of the 9 ERIC/DIALOG
terminal users being studied, one fairly representative search was selected.
This particular search, shown in Figure 3, was of moderate length and
complexity (nearly 50 commands) and makes use of nearly all of the DIALOG
commands (i.e., EXPAND, SELECT, RECALL, PAGE, DISPLAY SET HISTORY, DISPLAY
ITEM, COMBINE using AND, OR and NOT logic, and PRINT). The recorded elapsed
time for this original search was 34.22 minutes. (This figure, of course,
does include time spent at the terminal thinking, reading displays, etc.)

The chosen search was run repeatedly and continuously during the 
entire duration of the Lockheed system availability for seven days. This 
was done from 5:00 AM till 1:30 PM by four searchers on the following days: 
Thursday Oct. 18, Thursday Oct. 25, Friday Oct. 26, Monday Oct. 29, Tuesday 
Oct. 30, Wednesday Oct. 31, and Thursday Nov. 1. In this way, data was
collected for each day of the week and on one day (Thursday) for each of 
three consecutive weeks. 
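
The coverage of weekdays claimed for this schedule can be checked mechanically. The sketch below assumes the year 1973, which is taken from the study period stated elsewhere in this report; the variable names are illustrative only.

```python
from collections import Counter
from datetime import date

# Weekday names indexed by date.weekday() (Monday is 0).
NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
         "Friday", "Saturday", "Sunday"]

# The seven timing days listed in the text; the year 1973 is assumed.
TIMING_DAYS = [
    date(1973, 10, 18), date(1973, 10, 25), date(1973, 10, 26),
    date(1973, 10, 29), date(1973, 10, 30), date(1973, 10, 31),
    date(1973, 11, 1),
]

weekdays = [NAMES[d.weekday()] for d in TIMING_DAYS]
counts = Counter(weekdays)
for name in NAMES[:5]:
    print(f"{name}: {counts[name]} day(s)")
```

The calendar agrees with the text: every weekday is represented once except Thursday, which occurs three times.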

    SELECT File 1 (ERIC)
    RECALL a saved search
    EXPAND ADULT EDUCATION
    EXPAND line reference E6
    SELECT line reference R1
    SELECT line reference R4
    SELECT line reference R5
    SELECT line reference R7
    PAGE forward in Expand display
    SELECT ADULT EDUCATION PROGRAMS
    DISPLAY SET HISTORY
    COMBINE sets 1 to 5 with OR
    EXPAND FEDERAL AID
    EXPAND line reference E6
    SELECT line reference R1
    SELECT line reference R3
    SELECT line reference R6
    SELECT line reference R7
    SELECT line reference R12
    EXPAND GOVERNMENT ROLE
    NULL command
    EXPAND GOVERNMENT ROLE
    SELECT line reference E6
    NULL command
    DISPLAY SET HISTORY
    COMBINE sets 7 to 12 with OR
    COMBINE sets 6 and 13 with AND
    EXPAND EVALUATION
    EXPAND line reference E6
    SELECT line reference R1
    SELECT line reference R16
    PAGE forward in Expand display
    PAGE forward in Expand display
    SELECT line reference R36
    SELECT EVALUATION CRITERIA
    SELECT FOLLOW UP STUDIES
    SELECT REPORTS
    SELECT ANNUAL REPORTS
    COMBINE sets 15 to 21 with OR
    COMBINE sets 14 and 22 with AND
    DISPLAY set 23
    PAGE forward in Display
    PAGE forward in Display
    PRINT set 23 in format 5
    Continue PRINT of set 23
    SELECT SECONDARY EDUCATION
    SELECT SECONDARY GRADES
    SELECT POSTSECONDARY EDUCATION
    COMBINE sets 24, 25, and 27 with OR
    COMBINE sets 14 and 28 with AND
    DELETE all items in set 23 from set 29
    PRINT set 30 in format 5
    END

Fig. 3. Query Used for the DIALOG System Timing Exercise



An attempt was made to run the search line-for-line exactly as it
was originally run except for two minor changes. PRINT format 6 was used
instead of format 2 in an effort to save paper. Also, the title, searcher,
requestor and address were changed to suit the purposes of the experiment.
(The title space was used to record the start time of the search.) No time
was spent reading displays or doing anything other than simply entering the
appropriate command. Each command was entered as soon as possible after
the blue keyboard light and the "Enter" signal appeared. (There were some
exceptions to this general rule as discussed in the later section on
possible sources of error.) The searches were done as fast as possible
without straining the searchers. Usually, one searcher completed two
or three searches before another searcher took over; consequently fatigue
and boredom probably did not significantly affect the results of the
experiment.

Each searcher recorded the real clock time at start and finish of
each search as well as the elapsed time indicated by the system at the
end of each search. (The real elapsed time corresponds with the system
report except for searches during which the system was down.) Notes were
also made regarding any peculiar behavior on the part of the system, when
the system went down (if it did), any significant interruptions of the
searcher's work by telephone calls or other distractions, and keyboard
errors (if any) made by the searcher.

3. Possible Sources of Methodological Error

Before discussing the results of this effort, a number of possible
sources of error should be pointed out.

On Thursday Oct. 25 the RECALL command was unavailable from 8:45 AM 
through the rest of the day. It was agreed that the searches would continue 
anyhow since the recalled search was not actually used in this search but 
merely called up and rejected. It was assumed that the amount of time it
would take the system to respond with a display of the saved search would
not be very different from the time it would take the system to respond
with the "Invalid command" message. Therefore, the searches were continued
and the RECALL and PAGE commands were entered at the appropriate places in
the sequence.

Another minor deviation from the ideal methodology was that one
of the searchers did not wait for the "Enter" signal to appear on the
screen but started keyboarding the command as soon as the blue light
indicated that the keyboard was available for use. (This blue light
usually comes on for a few seconds, then goes off and comes back on again
simultaneously with the "Enter" signal.) The other three searchers
waited till the light appeared the second time before starting to enter
the commands. Probably the only place where this difference in method
would affect search time is when entering commands which include whole
words which must be typed in. It was thought that the results of this
discrepancy in method of entering commands would be negligible and would
merely tend to offset individual differences in typing speeds.



Possibly the most serious source of error lies in inconsistencies
in the tolerated degree of divergence from the original search. In
general it was agreed among all searchers that if an imperfect command was
accidentally entered early in the search such that the following set
numbers would not correspond with those of the original search, then that
search should be aborted and a new one begun. But if an imperfect command
was entered near the end of the search such that any resulting difference
in elapsed time would be negligible, then the search should be completed
and included in the data. Obviously this is a subjective judgment and
may therefore vary both between individual searchers and for each searcher
at different times. It has been assumed that the effects of such
differences in judgment will be averaged out in the results of the data. Thus,
only one completed search was ignored in the results pertaining to system
response time. This was the first search done by one of the searchers
which included a number of incorrect commands and resulted in an unusually
high search time (25.17 min.). This search is included in analyzing the
data according to different searchers but not when analyzing the data
according to differences in system response at different times of the day
since this search does not truly reflect a difference in system response
time.

Usually each searcher completed two or three searches before another
searcher started. However, on Monday Oct. 29, only one searcher was
available until 9:40 AM and therefore completed fifteen consecutive searches.
The elapsed times recorded during that time were, however, consistently low
so any effects of fatigue or boredom were apparently negligible.

4. Results

Table 1 in Appendix A shows the searcher, starting and ending times, 
and system-recorded elapsed time for each search completed during each of 
the seven days. This data was analyzed according to the response times of 
individual searchers, and system response times at different times of day 
and different days of the week. 

In Table 2 of Appendix A the data have been arranged according to 
searcher for each day searches were done. The cumulative mean elapsed time 
for each searcher from the first search done by that person on the first
day through the last search done by that person on the last day is also
indicated. The mean elapsed time and standard deviation of the sample are
given for both the daily sample and the entire series of searches for each
searcher.

Figure 4 shows the pattern of search times for all searches done by
each of the four searchers. The search times were plotted sequentially from
the first search on the first day to the last search on the last day. In
these graphs, searches which overlapped with system down time were excluded.
Figure 5 shows plots of the cumulative mean elapsed times for each searcher;
down-time searches were excluded here also.

In Figure 6 the curves for each day's searches are plotted, including
down-time searches (indicated as such). It can readily be seen that most
of the peaks represent searches that were interrupted by system down time.
The next two graphs (Figure 7 and Figure 8) are envelope graphs spanning the
shortest and longest search times for various times of day for the
seven-day period. Figure 7 excludes down-time and Figure 8 includes down-time
searches.

The data were separated into hourly periods such that all searches 
that started between 5:00 and 5:59 were grouped together, all those starting 
between 6:00 and 6:59 were grouped together, and so on. Figures 9 and 10 
plot the mean search time for each hourly group, with Figure 9 excluding, 
and Figure 10 including down-time searches. Figure 11 plots the mean search 
lengths for each day of the week.
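
The hourly grouping described above can be sketched as follows. The records shown are made-up values used only to demonstrate the grouping; they are not the study's measurements, and the function name hourly_means is illustrative.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (start time "HH:MM", elapsed minutes) pairs, not study data.
searches = [
    ("05:10", 14.0), ("05:40", 14.4),
    ("06:05", 13.8),
    ("10:15", 16.5), ("10:50", 17.1),
]


def hourly_means(records):
    """Group searches by the hour in which they started and average each group."""
    groups = defaultdict(list)
    for start, elapsed in records:
        hour = int(start.split(":")[0])   # 5:00-5:59 -> 5, 6:00-6:59 -> 6, ...
        groups[hour].append(elapsed)
    return {hour: round(mean(vals), 2) for hour, vals in sorted(groups.items())}


means = hourly_means(searches)
for hour, value in means.items():
    print(f"{hour}:00-{hour}:59  mean {value} min")
```

A search is assigned to a group by its starting time only, so a search that begins at 5:55 and runs past 6:00 still counts in the 5:00 group.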

5. Conclusions 

a) Influence of Search Operator's Keyboarding Skills 

Our searchers possessed a range of keyboarding skills. One searcher
had almost no typing skills, while the rest of the searchers had at least an
average typing skill. Although there was some individual variation in response
time among searchers, this was considered slight enough not to bias the results
of the experiment. Individual searchers were relatively consistent in their
search times for the same question, seldom varying more than ±2 minutes for a
search of about 15 minutes average duration.

There were systematic differences in the mechanical operating speeds
of the four searchers, with overall average times of 13.93, 14.03, 14.56,
and 16.21 minutes. However, it does not appear that the mechanical skills
of the terminal operator are a significant factor in explaining the time 
variances of the regular system users. A non-typist can do a search about 
as fast as a skilled typist. 

The cumulative mean elapsed times for each searcher fluctuated little 
after a learning curve of eight or nine searches. Individual searchers had 
a relatively rapid learning curve for this effort, with usually only the 
first few searches being significantly slower than the remaining searches. 

For this sample of one real query, the mechanical operations and 
system time accounted for 15 minutes out of the total recorded elapsed time 
of 34.22 minutes, providing some indication of the relative importance of 
the system speed versus the operator's cerebral speed.
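
The figures quoted in this subsection can be worked through directly. The sketch below uses only numbers stated above (the four searcher averages, the approximate 15 minutes of mechanical and system time, and the 34.22-minute original search); the variable names are illustrative.

```python
# Overall average search times of the four searchers, in minutes (from the text).
searcher_means = [13.93, 14.03, 14.56, 16.21]
spread = max(searcher_means) - min(searcher_means)

# Approximate mechanical-plus-system time and the original recorded time.
mechanical_and_system = 15.0   # minutes, rounded figure used in the text
original_elapsed = 34.22       # minutes, original search as recorded

think_time = original_elapsed - mechanical_and_system
mechanical_share = mechanical_and_system / original_elapsed

print(f"Spread between fastest and slowest searcher: {spread:.2f} min")
print(f"Time not explained by mechanics or system:  {think_time:.2f} min")
print(f"Mechanical and system share of the original: {mechanical_share:.1%}")
```

On these figures the slowest and fastest operators differ by a little over two minutes, while roughly nineteen minutes of the original search was spent on something other than keying and waiting.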

b) Influence of Hour of the Day 

The curves of response time according to the hour of the day were 
fairly level, with some tendency to peak at 10:00, 11:00 and 12:00. Using
the mean search length for hourly periods, excluding down-time searches, 
these peaks, as shown in Fig. 9, were only two minutes longer than the 
shortest average hourly search length. Including down-time searches, the 
difference between the lowest hourly mean search length and the highest was 
slightly less than three minutes. From this it could be concluded that the
variable of system response time at different times of day does not
significantly affect the recorded elapsed times of searches. No data was available
from the Lockheed records to show what the total volume of computer processing
was during the time that our test runs were made. Thus we cannot directly
correlate the system response time in our tests to the computer load at that
same time.

[Figure: Average Search Time for All Timing Searches Started at Various Times
of the Day (Excluding Down-Time Searches)]

Average overall times of 14.1, 13.9, 14 J, 14.5, 14.1, 16.8, 15.8,
15.8, and 14.0 minutes were measured for the one-hour time periods beginning
at 5:00 AM, 6:00 AM, to 1:00 PM, respectively. However, this difference
amounted to about 3 minutes for this 15-minute search, and thus does not
appear to be a significant factor in explaining the time variances of the
regular system users. A search will take approximately the same time to
complete regardless of what time of the day it is run.
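
The "about 3 minutes" figure can be reproduced from the hourly averages listed above. In the sketch below the value for the 7:00 AM period is not legible in the available copy and is entered as None; since it reads as fourteen-point-something, it should not change either the minimum or the maximum. The dictionary layout is illustrative.

```python
# Mean elapsed minutes by starting hour, from the text above.
# None marks the one value that is not legible in the available copy.
hourly = {
    "5:00 AM": 14.1, "6:00 AM": 13.9, "7:00 AM": None,
    "8:00 AM": 14.5, "9:00 AM": 14.1, "10:00 AM": 16.8,
    "11:00 AM": 15.8, "12:00 noon": 15.8, "1:00 PM": 14.0,
}

legible = {hour: m for hour, m in hourly.items() if m is not None}
slowest = max(legible, key=legible.get)
fastest = min(legible, key=legible.get)
spread = legible[slowest] - legible[fastest]

print(f"fastest period: {fastest} ({legible[fastest]} min)")
print(f"slowest period: {slowest} ({legible[slowest]} min)")
print(f"difference: {spread:.1f} min")
```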

However, a word of caution in accepting this conclusion seems called
for: although the elapsed time recorded for a down-time search may not
differ greatly from that of searches which did not include down time, the
elapsed real time may be quite long. For example, on October 30, a search
was begun at 10:37 and concluded at 11:45. The recorded elapsed time was
15.78 minutes; the real elapsed time was over an hour. Since for the
purpose of this experiment only system response time was being measured
exclusive of any operator "think-time" or goof-off time, our searchers
stayed by the terminal in such cases and completed the search as soon as
possible after the system came back up. If instead the searchers had left
the terminal during down time to attend to some other task (as may often
happen in real search situations) and had returned some time after the
system came back up (say, 15 minutes later), might this have noticeably
affected the recorded elapsed time (e.g., added 15 minutes to it)? The
answer to the question may be No, in which case down time may not
significantly affect average search length. If the answer is Yes, then the
question of whether the system tends to go down more frequently at certain
times of the day should be researched. (According to our small sample, the
system tends to go down most frequently between 10:00 and 11:00 AM; the next
most frequent time is between 7:30 and 8:00.)
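
The October 30 example above can be worked through to show how much wall-clock time went unrecorded. The sketch uses only the three figures given in the text; attributing the whole difference to the stopped system clock is an assumption.

```python
from datetime import datetime

# Clock times and recorded elapsed time for the October 30 search (from the text).
started = datetime.strptime("10:37", "%H:%M")
finished = datetime.strptime("11:45", "%H:%M")
recorded_minutes = 15.78

real_minutes = (finished - started).total_seconds() / 60
# Assumption: the whole difference is time when the system clock was stopped.
unrecorded_minutes = real_minutes - recorded_minutes

print(f"real elapsed time:     {real_minutes:.0f} min")
print(f"recorded elapsed time: {recorded_minutes:.2f} min")
print(f"not recorded:          {unrecorded_minutes:.2f} min")
```

The searcher was occupied for 68 minutes, of which only about a quarter appears in the recorded elapsed time.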

Since the overall monthly down time has been measured by Lockheed to
range from about 2% to 5%, with the 2% figure being more typical, the overall
effect of system down time on productivity should be negligible. Lockheed
personnel stop the system clock as soon as any evidence of malfunction is seen,
hence system down time is almost completely eliminated from the reported
search elapsed times.

One purely subjective caution might also be added pertaining to the
effects of system failure. One of the searchers stated that the system "felt"
much faster during the first few hours of the day and during the last half
hour or so. Although this difference is reflected in a very small amount of
elapsed time (at most 2 or 3 minutes), it seems to make a greater psychological
difference than the figures suggest. An operator may tend to get impatient
during slow or down-time searches, and this may affect the operator's
performance at the terminal. For the searchers doing this experiment, there was no
"cerebral time" involved. The commands were simply fed in verbatim, so we
merely became bored or impatient during slow or down system time. But for
"real" operators who have to think about what they're doing, slow or down
system time may result in impatience plus disrupted trains of thought. At
any rate, there may be a significant difference between the performance of
an operator using the terminal when its response is optimal and an operator
using it when the response is sluggish. If so, this might also affect the
average search length of "real" searches.

c) Influence of Day of the Week

There was some variation in mean elapsed times for different days of
the week, but this does not appear to be significant. The difference between
the shortest mean elapsed time (13.5, Monday) and the longest (16.1, Wednesday)
is less than the difference between the shortest and longest mean elapsed times
for the three Thursdays (13.5 and 16.6).
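
The comparison made in this paragraph amounts to two subtractions, shown here as a small sketch using only the four means quoted above (in minutes).

```python
# Shortest and longest daily means across weekdays (Monday, Wednesday).
across_weekdays = 16.1 - 13.5
# Shortest and longest daily means among the three Thursdays.
among_thursdays = 16.6 - 13.5

print(f"range across different weekdays: {across_weekdays:.1f} min")
print(f"range among the three Thursdays: {among_thursdays:.1f} min")
```

Because the same weekday varied more from week to week than different weekdays varied from one another, the day of the week by itself does not explain the differences.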

C. DETAILED EXAMINATION OF ERIC SEARCHES

1. The Source Data

The DIALOG software has a capability to provide a computer log
which records each DIALOG command presented for processing, the
identification of the terminal which submitted it, and the date and time (to
the nearest second) when it was executed at the central computer. This
continuous chronological log of the total operations for the nine
terminals in this study was specially prepared to enable us to do a
post-audit to trace the precise sequence of commands carried out by each of the
nine individual installations. Lockheed provided this selective command
log on magnetic tape for 15 selected days of operation during the Fall of
1973. All of the 15 days were ones in which there were no problems with
the performance of hardware, software, or communication equipment, i.e.,
DIALOG had no down time on any of these days. Since timing data was an
important part of our analysis it was essential that data be collected
from trouble-free days only. The 15 days studied were October 11, 17, 22,
23, and November 1, 5, 6, 9, 14, 15, 19, 21, 23, 26, and 27.

2. Definition of Search and Question 

First, it is important to understand the distinction between the
terms "search" and "question". We have followed the same convention
regarding the meaning of these terms as has Lockheed in its periodic
activity reports in the DIALOG/CHRONOLOG. The start of a new search is
indicated only by submission of a DIALOG "BEGIN" command. A search is
considered to be terminated only by encountering in the trace history
another BEGIN command or a system-generated message that says I/O SUBTASK
TERMINATED (indicating that the terminal had signed off) or a
system-generated message (e.g., beginning with the word DIALOG). A question,
on the other hand, is considered to be completed if any of the above
conditions occur or if a DIALOG "END" command is submitted. It is
common practice for a DIALOG user to commence operations at the terminal
with a BEGIN command and then proceed to submit several different logical
queries or questions, each terminated by an END command, before submitting
another BEGIN. Such a sequence would be interpreted to be one SEARCH but
several QUESTIONS. The sequence of commands shown in Figure 12 provides a
further illustration of this definition.
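The counting convention just described can be sketched in a few lines of Python. This is only an illustration, not the program used in the study: the command log is assumed to be reduced to a list of command names, the labels SIGNOFF and SYSTEM are hypothetical stand-ins for the two kinds of system-generated message, and a question is counted only if at least one command was entered since the previous boundary.

```python
# Hypothetical labels standing in for the system-generated messages
# (sign-off notice, or a message beginning with the word DIALOG).
TERMINATORS = {"SIGNOFF", "SYSTEM"}


def tally(commands):
    """Return (searches, questions) for a list of command names."""
    searches = 0
    questions = 0
    in_search = False
    pending = 0  # commands entered since the current question started

    for cmd in commands:
        if cmd == "BEGIN":
            # A new BEGIN closes any open question and starts a new search.
            if in_search and pending:
                questions += 1
            searches += 1
            in_search = True
            pending = 0
        elif cmd == "END" or cmd in TERMINATORS:
            # END completes a question; a terminator also ends the search.
            if in_search and pending:
                questions += 1
            pending = 0
            if cmd in TERMINATORS:
                in_search = False
        elif in_search:
            pending += 1

    return searches, questions


log = ["BEGIN", "SELECT", "SELECT", "COMBINE", "TYPE", "END",
       "EXPAND", "SELECT", "PRINT", "END",
       "BEGIN", "SELECT", "TYPE", "SIGNOFF"]
print(tally(log))
```

Here the first BEGIN is followed by two END-terminated questions, and the second BEGIN by one question closed by the sign-off, so the sample log holds two searches and three questions.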

The data tape initially had the following characteristics:

     772 searches
     1,129 questions
     281 terminal-hours of connect time.

A few command sequences which would be QUESTIONS by the preceding
definition but which appeared to represent housekeeping functions or
afterthoughts were deleted (manually) from the trace history log. These
were generally single commands between END commands (e.g., messages, a
single SELECT with no output function, a single PRINT with no logic
function, a BEGIN-END combination with nothing between).

After this manual editing, the data tape had the following
characteristics:

     730 searches
     1,011 questions
     239 terminal-hours of operation.

This data forms the basis for the detailed examination of ERIC searches.

This continuous log on magnetic tape was processed by a sequence of
computer programs written by ILR to produce the tables given in this chapter.

The timing data was gathered by subtracting the time at which a
given command was executed by the central computer from the time at which
the subsequent command was executed. The resulting figure comprises the
execution time for the first command, the transmission time back to the
searcher, the searcher's think time and keying time, and the transmission
time for the next command to the central computer. The result of scanning
the computer log provides reliable timing data on command use, with one
exception. It is frequently observed that after termination of a search by
an END command, a searcher may get up from the terminal and take a break,
coming back to resume searching after 10 or 15 minutes. In such cases, the
10 or 15 minutes would be counted by our program as having been associated
with the END command, whereas the time may really have been a
between-search pause.

Theoretically, there should be a trivial amount of time associated
with the END command. The data analysis programs were run to count the
END command time, but the results were subsequently edited to delete the
END times and re-distribute the percentages and totals to the amounts now
shown in the tables in this section.
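As a rough sketch of this timing method (not the ILR programs themselves), the following Python charges each command with the interval up to the next command and leaves END out of the totals. The (time, command) tuple layout and the HH:MM:SS format are assumptions made for the example, and the sample log is invented.

```python
from datetime import datetime

TIME_FORMAT = "%H:%M:%S"  # assumed layout of the log's time stamp


def command_times(log):
    """Sum elapsed seconds per command type, skipping END intervals.

    `log` is a list of (time_string, command) pairs in chronological order.
    Each command is charged with the time until the next command arrives.
    """
    totals = {}
    for (start, cmd), (stop, _next_cmd) in zip(log, log[1:]):
        elapsed = (datetime.strptime(stop, TIME_FORMAT)
                   - datetime.strptime(start, TIME_FORMAT)).total_seconds()
        if cmd == "END":
            continue  # may really be a between-search pause
        totals[cmd] = totals.get(cmd, 0.0) + elapsed
    return totals


def percentages(totals):
    """Redistribute the remaining time as percentages of the whole."""
    grand_total = sum(totals.values())
    return {cmd: 100.0 * secs / grand_total for cmd, secs in totals.items()}


sample = [("10:00:00", "BEGIN"), ("10:00:20", "SELECT"),
          ("10:01:00", "COMBINE"), ("10:01:30", "TYPE"),
          ("10:03:30", "END"), ("10:15:30", "BEGIN")]
totals = command_times(sample)
print(totals)
print(percentages(totals))
```

In the invented sample, the twelve minutes following END are dropped, so the percentages are computed over the remaining 210 seconds only.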

One part of the data reduction effort included a count of the number
of logical operators (AND, OR, NOT) in each question and search. Because
the DIALOG system permits a single COMBINE command to perform several
logical operations, it was necessary to locate each of these instances and
expand the command to get the proper count of logical operators. For example,
the command

     $1-4/*     (i.e., set 1 AND set 2 AND set 3 AND set 4)

is equivalent to

     $1*2*3*4

and has been counted as three AND operators.

Similarly, the command

     $7-9/+     (i.e., set 7 OR set 8 OR set 9)

is equivalent to

     $7+8+9

and such a command has been counted as having two OR operators.
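The expansion rule can be illustrated with a short Python sketch. It is not the program used in the study; the function name is hypothetical, and reading "-" as NOT in the explicit (non-range) form is an assumption about DIALOG syntax that the text above does not state.

```python
import re

# Range form such as "$1-4/*" or "$7-9/+": first set, last set, operator.
RANGE_FORM = re.compile(r"^\$?\s*(\d+)\s*-\s*(\d+)\s*/\s*([*+])$")


def count_operators(combine):
    """Count the logical operators implied by one COMBINE expression."""
    counts = {"AND": 0, "OR": 0, "NOT": 0}
    text = combine.strip()

    match = RANGE_FORM.match(text)
    if match:
        first, last, op = int(match.group(1)), int(match.group(2)), match.group(3)
        n_sets = last - first + 1
        # n sets joined by one operator need n - 1 occurrences of it.
        counts["AND" if op == "*" else "OR"] = max(n_sets - 1, 0)
        return counts

    # Explicit form such as "$1*2*3*4": count the operator symbols directly.
    counts["AND"] = text.count("*")
    counts["OR"] = text.count("+")
    counts["NOT"] = text.count("-")  # assumption: "-" means NOT here
    return counts


for expression in ("$1-4/*", "$1*2*3*4", "$7-9/+", "$7+8+9"):
    print(expression, count_operators(expression))
```

Both forms of each example give the same tally, which is the point of expanding the range form before counting.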
3. Findings

Table 5 provides summary data (by terminal) for the 15 full days
of operation. This includes gross data about the number of questions
and searches from each terminal, a tally of the number of times that each
logical operator was used with each terminal, and combinations of these
parameters. There are some wide terminal-to-terminal variations in these
figures, and they are summarized in this table. Notice the approximately
3:1 ratio in the use per question of the AND, 5:1 ratio in the use per
question of OR, and 32:1 ratio in the use per question of the NOT operator.

The difference in use of ANDs may reflect different approaches
to retrieval for the same question. Most of the installations used between
1 and 3 ANDs per question, with an average of just over 2 ANDs per question.
This figure is a little lower than that reported in 1969 by Roger Summit
for the RECON system, where the average number of Boolean intersections as
used by RECON searchers was 2.30. The data given in 1972 in a related study
by Martha Williams² of the number of ANDs used in 126 SDI profiles running
on the IITRI computer-based current awareness system leads to an estimate
of 2.37 ANDs per profile. If an average on-line search in our study is
assumed to be equivalent to an average profile in the IITRI study,
this would be 2.37 ANDs per question.

The variation in number of ORs per question (a range of about 2-11
ORs per question) probably reflects some searching installations' practice
of SELECTing a range of terms from a display with one SELECT command. Terms
thus SELECTed are automatically ORed together, and thus fewer ORs are required
to be keyboarded to establish the OR relationship. For a further discussion
of this point, see the later chapter on Search Guidelines. The data in the
Williams article leads to an estimate of 3.01 ORs per question for their
situation.

The great variation in NOTs used per question (.02-.64 NOTs per
question) is probably not statistically significant due to the infrequency
of occurrence in this small sample (only 154 times in over 1000 questions).
The data in the Williams report leads to an estimate of .47 NOTs per profile.
A 1971 report by James Carmon of the University of Georgia experience in
computer-based current awareness services noted that, "The six profile
batches range from 3% to 11% of the profiles which use terms with NOT logic."
Use of the NOT logic would seem to be related also to the particular data
base being searched.

Table 6 provides a summary distribution of the number and relative
percent of DIALOG commands used by each terminal. There are differences
here in each terminal's use of these commands. During the entire 15-day
period, the KEEP command, for example, was used by only two installations
(one of which used it once), the RELEASE command was used only five times
by one terminal, and nobody used the EXPLAIN command. Some of the commands
are equipment-dependent (e.g., DISPLAY is not generally used without a CRT
terminal and TYPE is recommended for use by all dialup terminals) so that
the percentage distribution of command types reflects this factor also. The
range of command use for the terminals is summarized in this table. One
obvious difference is the relatively large number of TYPE commands used by
the slow speed dialup printing terminals, and the proportionally large
number of EXPAND commands used by the CRT display terminals. TYPE commands
are generally used by non-CRT (hardcopy) terminals instead of DISPLAY. As
expected, CRT terminals used many more DISPLAYs and hardcopy terminals
(3, 88, 114, 125) used many more TYPEs. The high speed terminals would
also seem to encourage the use of the high data transfer commands such as
EXPAND and DISPLAY.

Table 7 provides the same type of data as Table 6 except that the
numbers and percentages are in terms of the time used by each of the commands.
The completed time was the total elapsed time from the receipt of the command
by the computer until the receipt of the next command. Table 7 is probably
the most important data from the 15-day test records because the data is
related directly to the terminal time used. Here we see several
installations using considerably more of their terminal time for output
functions (DISPLAY, TYPE, PAGE).

Four of the installations, all with mechanical, hardcopy non-CRT
terminals, used approximately one quarter of their time on the TYPE command.
It seems a possibility that some searchers use the terminal primarily to
negotiate and arrange for a printout, while other searchers put more emphasis
on the terminal being the actual output device. This point cannot be resolved
without examining the nature and volume of printout requested by each of the
terminals.

Another possible explanation for the large amount of time
(proportionally) spent by the slow speed terminals on the output commands
is that these commands involve a large amount of data transmission time.
For example, a single TYPE command (assuming an average of 600 characters
per item) would require 1.3 seconds on the 480 character per second
terminal, and 20 seconds (and a correspondingly higher percentage of time
used) on the 30 character per second terminal. Similarly an EXPAND command
(assuming an average of 800 characters) would require 1.7 seconds on the
480 character per second terminal, and 27 seconds on the 30 character per
second terminal. This is discussed further in a later section which
explores the functional utilization of commands for each of the terminals.
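The transmission estimates above follow from dividing the message length by the line speed. A minimal Python check, using only the character counts and speeds quoted in the text (the function name is illustrative):

```python
def transmission_seconds(n_chars, chars_per_second):
    """Time to send n_chars at a given line speed, ignoring any overhead."""
    return n_chars / chars_per_second


for command, n_chars in (("TYPE", 600), ("EXPAND", 800)):
    for speed in (480, 30):
        seconds = transmission_seconds(n_chars, speed)
        print(f"{command:6s} {n_chars} chars at {speed:3d} cps: {seconds:.2f} s")
```

The 600-character TYPE output works out to exactly 1.25 seconds at 480 characters per second, which the text gives as 1.3; the other three figures match the text after rounding.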

Table 8 provides the data in terms of the number and percent of
DIALOG commands used per question. This same data is plotted in Figure 13.
The dominance of output or display commands for some installations can also
be seen in this data. This table also shows the range of commands per
question searched by the installations. For the 15-day period there was an
average of 24.76 commands per question for all terminals, although the
individual terminal averages ranged from 14.60 to 51.92 commands per question.

Table 9 also provides data for each installation regarding command
utilization per question. This data is in terms of terminal time used for
each command. The average search time for this 15-day sample ranged from
5.5 to 29.0 minutes per question, with an average of 14.2 minutes per
question for all searches done over this 15-day period.

Table 10 and Figure 14 provide data for each installation regarding
command utilization per search, in terms of the number of commands used.
Table 11 provides the same type of data expressed in terms of terminal
time used.

Table 7 provides data for the total amount of terminal time used
by each installation during this 15-day period (excluding the time
associated with the END command). Table 6 provides data for the total number
of commands used by each installation for this terminal time. This data
assembled together in Table 12 provides another measure of the terminal
activity at each installation (i.e., the rate at which commands are
executed at the terminal). For the set of installations studied for the
entire 15-day period, an average of about 85 commands per hour were
entered at the slow speed hard copy terminals, and an average of about
112 commands per hour were entered at the fast terminal installations
(with high speed CRT equipment).

D. CLASSIFICATION OF ERIC QUESTIONS

1. Classification Rules

In an attempt to explain differences in the rate of processing
questions among the various installations, one suggested explanation was
that perhaps some installations might be running a lot of "simple"
questions while other installations are running "complex" and hence more
time-consuming questions. To study this possibility, an algorithm was
devised which classifies or grades queries submitted for on-line ERIC/
DIALOG processing according to logical complexity. By applying the
algorithm to all of the 1,011 questions submitted by the nine
installations during the 15 days of operation under investigation, a
measure was obtained of the mix of question types submitted by a given
terminal. It was felt that a comparison of these question mixes would
be helpful in understanding why certain organizations processed more
questions per hour than others.

The algorithm is intended to assign to any given question a
rating of "simple", "moderate", or "complex" that is consistent with the
judgment of logical complexity that might be made by persons experienced
with automated information retrieval systems. Clearly many queries could
reasonably be considered to be either of two neighboring categories.
However, for the purposes of comparing general trends among several
installations it was felt to be sufficient that the algorithm be
consistent, and also assign a rating of logical complexity that agrees in
a high percentage of cases with that of human judgment.

The classification algorithm takes into consideration several
parameters of the search query. These are the total number of

     • DIALOG commands (N)

     • SELECT commands (S)

     • COMBINE commands (C)

     • logical operators (L)

     • AND operators (A).

Three different aspects of a query are considered by the algorithm:
the total number of DIALOG commands, the number of SELECTs, and the apparent
complexity of the query logic. A rating of "simple", "moderate", or
"complex" is assigned independently for each of these three aspects of a
question. If the same rating is assigned to two or three of these aspects,
that rating becomes the rating for the entire question. If each of the
three aspects is assigned a different rating, then the entire question is
judged to be "moderate". All other possibilities are judged to be "simple".

The rating assigned to the first aspect (total number of DIALOG
commands) is made as follows. The total number of DIALOG commands is counted,
but in so doing only 1/3 of the DISPLAY, TYPE, PRINT, and PAGE commands are
counted. These commands are given less weight because they represent output
functions rather than search strategy functions. The resulting "N" total
determines the command rating as follows:

     N          Rating

     < 15       simple
     15-30      moderate
     > 30       complex

The rating assigned to the second aspect (total number of SELECTs)
is made by simply counting the number of SELECT commands, "S", in the
question and applying this rule:

     S          Rating

     < 9        simple
     9-14       moderate
     > 14       complex.

The rating assigned to the third and final aspect (complexity of
search logic) is made by applying the rules indicated below. ("A" is
the number of AND operators, "C" the number of COMBINE commands, "L" the
number of logical operators, "S" the number of SELECTs.)

     1. If A ≥ 3 and S > 12, then the question is judged to be
        complex.

     2. Otherwise apply the following decision chart.

[Decision chart not reproduced in this copy.]

In evaluating and calibrating this algorithm, two staff members
made independent "simple", "moderate", or "complex" judgments of
approximately 125 of the questions processed during the 15 days under study.
The ratings assigned by the algorithm to these questions agreed with the
composite judgment of the staff members as consistently (85-90%) as the
individual staff members agreed between themselves. This suggests that the
algorithm provides a consistent rating approximately equivalent to that
which would be obtained by a manual examination of each question.
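The rating rules described above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the study's program: the category boundaries follow one reading of the printed tables (below 15 and below 9 are "simple", above 30 and above 14 are "complex"), the weighted command count is kept as a fraction rather than rounded, and the logic rating is passed in as an argument because the decision chart for that aspect is not available here. All function names are hypothetical.

```python
from collections import Counter

OUTPUT_COMMANDS = {"DISPLAY", "TYPE", "PRINT", "PAGE"}


def weighted_command_count(commands):
    """N: all commands, with output commands counted at one third."""
    output = sum(1 for cmd in commands if cmd in OUTPUT_COMMANDS)
    return (len(commands) - output) + output / 3


def rate_commands(n):
    """Rating for the first aspect (assumed boundaries: <15, 15-30, >30)."""
    if n < 15:
        return "simple"
    return "moderate" if n <= 30 else "complex"


def rate_selects(s):
    """Rating for the second aspect (assumed boundaries: <9, 9-14, >14)."""
    if s < 9:
        return "simple"
    return "moderate" if s <= 14 else "complex"


def overall_rating(command_rating, select_rating, logic_rating):
    """Two or three matching ratings win; three different give 'moderate'."""
    rating, votes = Counter(
        [command_rating, select_rating, logic_rating]).most_common(1)[0]
    return rating if votes >= 2 else "moderate"


question = ["BEGIN"] + ["SELECT"] * 10 + ["COMBINE"] * 3 + ["TYPE"] * 6 + ["END"]
n = weighted_command_count(question)
s = question.count("SELECT")
print(n, rate_commands(n), rate_selects(s))
print(overall_rating(rate_commands(n), rate_selects(s), "complex"))
```

In the sample question the six TYPE commands count as two, so N is 17 rather than 21; with two "moderate" aspects the question is rated "moderate" whatever the logic rating.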

As a point for possible improvement of the algorithm, it was noted
after all of the work had been done that we had underestimated the number
of OR operations in some of the questions. We unfortunately did not examine
and compute an equivalent number of OR operations for those searches that
used a SELECT range. Our algorithm did not recognize SELECT E1-E6,E8,E10
as implying 7 ORs instead of none. The terminal that used such a composite
command was undercounted in the number of SELECT commands and ORs compared
to those that would have been counted if the searcher had SELECTed terms
individually and then COMBINEd them later. In these cases the number of
SELECT commands does not correspond to the number of descriptors used.
(Another case is the use of SEARCH SAVEs, which commonly contain many
descriptors and appropriate logical operators, retrieved as one set.)
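The correction suggested here amounts to counting the terms named in a SELECT range and subtracting one. A small Python sketch, assuming the argument is a comma-separated list of E-numbers and E-number ranges as in the example above (the function names are illustrative):

```python
import re


def selected_terms(select_argument):
    """Number of index terms named by an argument such as 'E1-E6,E8,E10'."""
    count = 0
    for part in select_argument.split(","):
        part = part.strip()
        range_match = re.fullmatch(r"E(\d+)-E(\d+)", part)
        if range_match:
            first, last = int(range_match.group(1)), int(range_match.group(2))
            count += last - first + 1
        elif re.fullmatch(r"E\d+", part):
            count += 1
        else:
            raise ValueError(f"unrecognized SELECT item: {part!r}")
    return count


def implied_ors(select_argument):
    """Terms SELECTed together are ORed, so n terms imply n - 1 ORs."""
    return max(selected_terms(select_argument) - 1, 0)


print(selected_terms("E1-E6,E8,E10"), implied_ors("E1-E6,E8,E10"))
```

For the example in the text, the range E1-E6 contributes six terms and E8 and E10 one each, giving eight terms and therefore seven implied ORs.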

2. The Data

The results of applying the algorithm to all questions submitted
by nine terminals during the 15 days under study are shown in Figure 15.
It can be seen that there is a considerable difference in the mix of
question complexity associated with each of the installations. One
installation had a high of about 35% complex questions, while another
installation had less than 4% complex questions. Several aspects of this
issue are discussed below.



a. Search Time as a Function of Question Complexity

Intuitively, one would expect that complex questions would take more
on-line time than simple questions. The data from this study tends to
support that safe hunch. Figure 16 shows, on a terminal-by-terminal basis,
how the average search rate correlated with the percent of simple questions
processed by that terminal. The percentage of simple questions processed
by each terminal was taken from the data in this section. The search rates
used in this figure were the rates experienced for these terminals during
the same general period (October-November 1973) that this question
complexity data was drawn from. It can be seen from this figure that there
is some slight correlation between these two factors, but not as
pronounced as one might expect.

b. Question Complexity as a Function of the Installation

One might suggest that the question mix might be influenced by the
particular type of installation, sponsoring organization, constituency, or
user group that is being served by the terminal installation.
Unfortunately, this study did not collect any data that could be used to
investigate this question. We do know that the installations were serving
different types of user groups.

c. Question Complexity as a Function of the Terminal Equipment

It seemed possible that the question mix might be related to the
type of terminal equipment, for the reason that a terminal operator might
be more inclined to use more EXPAND and DISPLAY commands if they could be
swiftly executed. Figure 16 provides some data on this point, and shows
that there does not appear to be any strong correlation on this point.

d. Question Mix as a Function of Personal Work Habits of the Analysts

In the work of classification and indexing, it has been known
for years that there are differences in the approach and results when two
or more people do the same task. Even the same indexer repeating a given
task at a later date may be inconsistent in the assignment of indexing terms.
The reports of many indexer consistency tests have made this point. We
now have an analogous situation in which it seems quite likely that two or
more profile analysts or terminal operators, given the same information
request and conditions, will generate different search statements. It also
seems possible that given the same information request, one person could
come up with a simple search while another person came up with a complex
search (and they might even get approximately the same results). We have
seen instances in which this has happened. We have talked to analysts who
readily agree that they always try to make their search as comprehensive
as possible, and we have also talked to analysts who make a point of always
trying to make a search as simple as possible in order to extract a few of
the most relevant citations.

Fig. 16. Relationship of Search Rates to Search Complexity

Thus we see a possible pattern in which question complexity is a
matter of personal style and work habits, or personal approach to a problem
(or perhaps even a matter of institutional style or policy). This is
a significant factor for this study because most of the terminal
installations use only a few terminal operators (typically the bulk of
searching is done by 1-3 different individuals at each site);
consequently, the pattern of a single operator can in fact be the pattern
for the installation.

No data was collected during this study that could be used to support
or reject this hypothesis. However, this point of view was confirmed in
many of our discussions with terminal operators from other installations.
The data from the recent BIOSIS test (26 terminal installations running the
same 20 questions) should provide some very good information on this point.

E. VARIANT FORMS OF ERIC DESCRIPTORS AND IDENTIFIERS

1. Background

The EXPAND command results in a display such as that shown in Figure
17 that is a mixture of both Descriptors and Identifiers. The 7,520
Descriptors are controlled terms from the ERIC Thesaurus, have a rigid
authority control procedure associated with their input, and are seldom in
error. The Identifiers are not subjected to the same quality control and
review procedures as the Descriptors, and this has resulted in a
considerable degree of inconsistency and error. However, efforts are
underway to standardize use of Identifiers.

Because of these differences and for other reasons, the Descriptor
and Identifier files were at one time separated in the DIALOG system.
However, for at least the last year they have been combined in a single
file so that a mixed collection of Descriptors and Identifiers is
displayed as a result of the EXPAND command.

During our early use of the DIALOG system we became aware of many 
instances in which the EXPAND command would show one or more variant forms 
of the same word (e.g., both the singular and plural forms of the same 
word). Because the variant forms occurred so frequently, it was felt that
perhaps some conscientious searchers would anticipate their occurrence and 
would use more EXPAND and SELECT commands than other searchers, and that 
this might contribute to an increase in the average length of the searches, 
if done consistently. We were also concerned with the retrieval loss that 
might be experienced by not including in the search statement all of the 
variant forms of words. For these reasons, we decided to explore in more 
detail the nature and frequency of the variant forms, and their impact on 
the search process and results. 

2. Nature and Extent of Variant Forms 

An analysis of the printed ERIC term posting frequencies would have
provided some useful information about the frequency of occurrence of
variant forms, but would not have led directly to information about their
impact on searching. For that reason, it was decided instead to study a
number of representative real searches that had been done by other
installations. Using the command histories provided by Lockheed for the
nine terminals under study, a total of 80 searches were sampled randomly
from three days of DIALOG operation. As described earlier, a search was
defined as any one discrete command history with a BEGIN and END command,
and usually consisted of one or more EXPAND, SELECT, and COMBINE commands
with or without a PRINT command.

Each term that was SELECTed by the searcher in these 80 searches
(e.g., IT=SUMMER SCHOOL) was looked up in the July 1973 issue of
ERIC/DIALOG Cumulative Listing of Descriptor and Identifier Usage in RIE
and CIJE to see if there were any variant forms of this term, and to see
to what extent the searchers picked up the variant forms.



[Figure 17 display: two EXPAND listings of Identifier index terms, not
legibly reproduced in this copy. Legible entries include
IT=CALIFORNIA UNIVERSITY, IT=CALIFORNIA UNIVERSITY (BERKELEY),
IT=CALIFORNIA UNIVERSITY AT BERKELEY, IT=CALIFORNIA UNIVERSITY BERKELEY,
IT=UNIVERSITY OF CALIFORNIA, and IT=UNIVERSITY OF CALIFORNIA (BERKELEY).]

Fig. 17b. EXPAND Showing Corporate Authors (Identifiers)

Variant forms were identified in this sample as being of the
following types:

-- MORPHOLOGICAL

     • singular vs. plural form (TEST, TESTS)

     • gerund form (TESTING)

     • possessive forms (BLOOM TAXONOMY, BLOOMS TAXONOMY)

-- SPELLING FORM

     • English vs. American forms (LABOUR, LABOR)

     • Acronyms (IGE, INDIVIDUALLY GUIDED EDUCATION)

     • Abbreviations (CAL., CALIF., CALIFORNIA)

     • Compound nouns with or without space or hyphen
       (POST SECONDARY EDUCATION, POST-SECONDARY EDUCATION,
       POSTSECONDARY EDUCATION; FILM STRIP, FILMSTRIP)

-- SPELLING ERRORS (COUNSELING GOALS, COUNSELING FOALS)
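Some of these variant types are regular enough to be caught mechanically. The Python sketch below groups index terms that differ only in spacing, hyphenation, or a trailing S. It is a hypothetical illustration, not a procedure used in this study, and the normalization is deliberately crude: it will not catch spelling differences such as LABOUR and LABOR, acronyms, abbreviations, or typing errors, and it may merge some unrelated terms.

```python
from collections import defaultdict


def variant_key(term):
    """Crude normal form: drop hyphens, spaces, and a trailing S per word."""
    words = term.upper().replace("-", " ").split()
    stems = [w[:-1] if len(w) > 3 and w.endswith("S") else w for w in words]
    return "".join(stems)


def group_variants(terms):
    """Return lists of terms that share a normal form (two or more only)."""
    groups = defaultdict(list)
    for term in terms:
        groups[variant_key(term)].append(term)
    return [group for group in groups.values() if len(group) > 1]


terms = ["TEST", "TESTS",
         "POST SECONDARY EDUCATION", "POST-SECONDARY EDUCATION",
         "POSTSECONDARY EDUCATION",
         "FILM STRIP", "FILMSTRIP",
         "BLOOM TAXONOMY", "BLOOMS TAXONOMY",
         "LABOUR", "LABOR"]
for group in group_variants(terms):
    print(group)
```

With the examples from the list above, the four morphological and spacing groups are found, while LABOUR and LABOR remain separate.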

One extreme example of variant entries is Title III of the Elementary and 
Secondary Education Act which is listed in 17 different ways. Our analysis 
work also considered some words or variant forms that are often used in a 
synonymous way (e.g., CHICANO, MEXICAN-AMERICAN).

Our analysis of all of the 80 command histories resulted in the data
shown in Table 13. Several observations can be made regarding the data in
this table:

     • There are many variant forms in this data base. Comprehensive
       searching would have included 123 variant forms in a total of 764
       terms (16.1%). About one out of every 6 terms had a variant form
       which could have been added to the EXPAND and/or SELECT operation
       if the searcher desired.

     • The searchers did not use many variant forms in their searches.
       Consequently, this probably did not significantly influence the
       time required to do the searches. Furthermore, the searchers' use
       of variant forms was distributed rather evenly over all the
       terminals so that the inclusion of variant forms in the search
       statement probably did not contribute significantly to the
       difference in average search speeds between the terminals.
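The proportion quoted in the first observation can be checked directly from the two totals given there; the function name below is illustrative only.

```python
def share_with_variants(variant_forms, total_terms):
    """Fraction of SELECTed terms for which a variant form was found."""
    return variant_forms / total_terms


share = share_with_variants(123, 764)
print(f"{share:.1%} of terms had a variant form")
print(f"about one term in {1 / share:.1f}")
```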

A more detailed analysis of the variant forms encountered in this study led
to the data shown in Table 14. The data from Tables 13 and 14 suggest a need
for some quality control improvements in the data base. However, we did
check further to see what impact the use or non-use of these variant forms
would have on the search results.

                              TABLE 13

     EXTENT OF VARIANT FORMS USED OR NOT USED IN 80 REAL SEARCHES

                                        Oct. 30    Oct. 31    Nov. 1
                                        Searches   Searches   Searches

Total number of searches examined          21         39         20

Total number of terms originally          276        370        118
used in these searches (i.e.,
SELECT)

Total number of variant terms               9         10          0
actually used by the original
searchers

Total number of additional                 30         55         19
variant terms found by our
lookups that could also have
been used by the original
searchers but were not

Total number of variant forms              39         65         19
that could have been used


                              TABLE 14

          CHARACTERISTICS OF VARIANT FORMS USED OR NOT USED
                        IN 80 REAL SEARCHES

     Types of Variant Forms            Number of
                                       Occurrences

     Singular                              34
     Plural                                15
     Gerund                                 1
     Possessive                             2
     Spelling                               3
     Spacing and hyphenation               12
     Acronyms and abbreviations             5
     Errors                                19

[The "Number of Items" column is only partly legible in this copy
(69, 191, 11, 2, 1, ...) and its values cannot be aligned to rows.]

3. Impact of Variant Forms on Search Output

For this part of the study, nine of the previous 80 searches were
chosen for further analysis because they frequently used descriptors that
had variant forms. In order to see what effect the variant forms had on
the search results, the search steps of the original searches were
re-created, followed by another search that incorporated all of the possible
variant forms in the place where they would have been used in the original
search. The number of output citations was noted after each COMBINE
operation, in both the original and augmented searches. No attempt was made
to judge the relevance of the selected citations. The results of these
searches are shown in Table 15.

The data from this study seem to indicate that although there are a
significant number of variant forms of subject terms that could be
incorporated into the searches, the addition of these variant forms to the
searches does not significantly affect the search results in terms of
retrieving a large number of additional citations. In seven of the nine
sample searches, the results stayed the same when a total of 28 variant
forms was added to the original searches. In the remaining two sample
searches (searches 8, 9 in Table 15) a total of seven additional citations
was added to the original 450 citations as a result of the inclusion of
12 additional variant forms, for an increase of about 1.5 percent of the
original citations for those two searches.

As guidelines for the searchers, the data would suggest that if
the most important Descriptors and Identifiers were used in the search,
the redundancy of indexing is such that the lookup and inclusion of every
possible variant form of descriptor or identifier may not be necessary
unless the highest possible recall is an objective of the search. One
major exception to this practice is the handling of variant forms of author
names and institution names. There are many variant forms for these names,
and they should be EXPANDed and included in all variant forms.
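
The percentage increase reported above for searches 8 and 9 can be checked directly from the two counts given in the text. The following is only a small arithmetic sketch in Python; the variable names are ours and are not part of the study.

```python
# Counts reported for the two sample searches whose output changed
# (searches 8 and 9).
original_citations = 450
added_citations = 7

# Relative growth in output caused by adding the 12 variant forms.
increase_pct = 100 * added_citations / original_citations
print(f"{increase_pct:.2f} percent")
```

The result falls between 1.5 and 1.6 percent, which is consistent with the "about 1.5 percent" quoted in the text.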

It is planned that in January 1975 a complete revision of the
ERIC/DIALOG data base will be made available. The new data base will be
offered with the same powerful full-text indexing techniques currently
available on all other DIALOG files. Full-text indexing will include the
title, Descriptor, Identifier and corporate author fields. The searcher will
be able to retrieve the bound Descriptor and Identifier phrases as done now,
but in addition, the searcher will be able to locate any word pattern,
including word distance and order, contained in any combination of the
full-text indexed fields. Full-text SELECT operations allow the specification
of inter-word distances at the word, sentence, field or citation levels in
any combination. This facility will greatly simplify the process of
collecting word form variations as well as synonyms. For example, by
SELECTing the term READING/DE, ID the searcher will immediately obtain all
uses of the word READING in any Descriptor or Identifier regardless of its
word position. Thus postings to hundreds of ERIC Descriptors and Identifiers
will have been precombined for the searcher.
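
The "precombining" effect described above can be pictured with a small sketch. This is not DIALOG's implementation, which the report does not describe; it is an illustrative Python model in which the function name `build_word_index` and all accession numbers are invented. It shows how indexing each word of a bound phrase lets a single word lookup stand in for an OR over every phrase containing that word.

```python
from collections import defaultdict

# Hypothetical postings: bound Descriptor phrase -> accession numbers.
# The phrases are ERIC-style terms; the accession numbers are invented.
phrase_postings = {
    "READING ABILITY": {"ED000001", "EJ000002"},
    "READING TESTS": {"ED000003"},
    "REMEDIAL READING": {"ED000001", "ED000004"},
    "VOCABULARY": {"ED000005"},
}

def build_word_index(postings):
    """Index every word of every phrase, whatever its position in the phrase."""
    index = defaultdict(set)
    for phrase, accessions in postings.items():
        for word in phrase.split():
            index[word] |= accessions
    return index

word_index = build_word_index(phrase_postings)

# One word lookup gathers the postings of every phrase containing the word.
reading_hits = word_index["READING"]
print(sorted(reading_hits))
```

Note that REMEDIAL READING is picked up even though READING is its second word, which is the "regardless of its word position" property the text mentions.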
F. INFLUENCE OF OTHER FACTORS ON SEARCH SPEEDS

1. Terminal Equipment

In theory, the type and speed of terminal equipment definitely
influences the search strategy, the command utilization, and the
productivity of the terminal installation; that relationship is not clearly
borne out in the test data. The data in Table 12 clearly shows that for
the 15 day detailed sample, the fast CRT displays, as a group, execute
about 1.3 times as many commands per terminal hour as the slow speed
terminals do, and about 1.2 times as many questions per hour. However,
over the entire span of terminal operation described in Table 1 and
Figure 1, a mixed trend is seen: the slow speed mechanical terminals,
as a group, seemed to run more searches per hour than the fast CRT
terminals, especially for the last third of the period that is shown
in Figure 1. It would seem that the data does show that the type and
speed of the terminal equipment is in fact a significant factor in
explaining some but not all of the differences in search productivity
for the installations studied.

Table 16 does show that there are clear differences in command
utilization by terminal type. Both types of terminals used about the same
percent of their commands for query formulation and negotiation (in the
range of about 34-69% of all commands used). However, the slow mechanical
terminals used a greater percent of their commands for output functions
(about 23-33%) than was used by the fast CRT terminal installations (about
13-22%).

A greater percent of TYPE commands was used with the slow terminals,
in comparison to the equivalent DISPLAY command for the fast CRT 
terminals. Probably this was because the slow terminals had hard copy 
output as a result of search negotiation operations that could also be 
used for immediate search results, particularly for searches resulting 
in a small number of citations. 

2. Continuing Education, and Association with other Searchers 

It seems quite possible that a searcher who was isolated from
other searchers would not continue to develop the searching skills and
performance that might otherwise be possible. A searcher working with
a large group of other searchers within the same institution would be
in a position to share ideas and techniques to gradually upgrade the
performance of the entire group of searchers. Similarly, participation
in user groups, continuing training by representatives of the on-line
service, and site visits to other terminal installations would all seem
to be positive influences in upgrading searcher performance. It is quite
possible that some of the installations included in this study operated
with a very small staff of searchers (e.g. 1-3), and were relatively limited
in the extent to which they could take advantage of these opportunities
for continuing education and training. This factor might explain some
of the differences in installation productivity.
The DIALOG Users Group met for the first time in early 1973. The ERIC
Users Group also met for the first time in 1973. It is not clear what type
of person attended these meetings (e.g. managers instead of searchers), and
it might be that these meetings did not contribute significantly to the
terminal performance that we measured for 1973. During this same time period,
DIALOG representatives were visiting each of the terminal installations and
answering questions, but did not have a formal program of continuing education.
The instruction manuals, newsletters, and other documentation materials were
not as well developed as they are today, and may not have been a factor in
improving the terminal performance in 1972-73. However, it should be noted
that many of the installations have shown a continually improving performance
picture as the documentation and user communication channels improved.

What we can say about these communication factors is that we feel
that they can influence terminal productivity; however, we have no direct
data from this study to support that feeling.

3. Subject Expertise 

It would seem reasonable to expect higher performance from searchers
who were subject specialists in the topics being searched. One would also
expect that searchers at the ERIC clearinghouses would be particularly
proficient because they knew the data base and the indexing terminology.
One of our test installations that was an ERIC clearinghouse did in fact
have a high search rate. No data was available, however, to relate the
individual searchers and their backgrounds to the searches analyzed
during this study.

4. Extent of Pre-Planning Before Searching

Almost all of the installations followed the practice of doing
some preparatory work before searching at the terminal. This is clearly
seen by most installations as a practice which can result in more effective
use of terminal time. Some installations insist on this approach as part
of their operating policy and procedures. One of the test installations
that did not follow this practice did have a relatively low search rate.

If this practice is followed too closely in the quest for increased
on-line productivity, with little [illegible] work at the terminal, it
is possible that the whole character of on-line searching can be changed
from an interactive dialog to a remote-job entry situation. This would
be unfortunate because it would deny us some of the important advantages
of interactive searching. There is clearly a tradeoff between additional
preparatory time and time spent on-line. A rational approach to
productivity enhancement will try to minimize total cost.
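
One way to picture the tradeoff between preparatory time and on-line time is to total the cost of both phases. The sketch below is only an illustration: the function `total_search_cost`, the hourly rates, and the minute figures are all hypothetical and do not come from the study data.

```python
def total_search_cost(prep_minutes, online_minutes,
                      staff_rate_per_hour, connect_rate_per_hour):
    """Staff time is paid during both phases; connect charges apply only on-line."""
    staff_cost = (prep_minutes + online_minutes) / 60 * staff_rate_per_hour
    connect_cost = online_minutes / 60 * connect_rate_per_hour
    return staff_cost + connect_cost

# Hypothetical figures for the same question handled two ways.
little_prep = total_search_cost(prep_minutes=5, online_minutes=30,
                                staff_rate_per_hour=12.0,
                                connect_rate_per_hour=25.0)
more_prep = total_search_cost(prep_minutes=20, online_minutes=15,
                              staff_rate_per_hour=12.0,
                              connect_rate_per_hour=25.0)
print(little_prep, more_prep)
```

Under these assumed figures the better-prepared search costs less in total, but the comparison would reverse if the extra preparation saved little on-line time, which is why total cost rather than on-line time alone is the quantity to minimize.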

5. Fee Versus Free Service, and Cost-Conscious Attitudes 

The cost-conscious attitude of the searchers or their institution
seemed to have an important influence on the terminal productivity.
Searchers who were operating in an environment in which the searching
costs were fully subsidized and were perceived as "free" by the searcher
used the terminal in a different manner than those searchers who were
operating in a cost-recovery or full charging mode. The searcher who
visualizes a taxi cab meter mounted on the side of the terminal and
ticking off dollars to correspond to terminal time is much more anxious
to get the search completed as soon as possible. This attitude has been
confirmed in many discussions with searchers and installation managers,
both for the installations in this study and elsewhere.

Table 17, restructuring the data from Table 12, provides a summary
of the performance data for the installations included in this
study. The charging services run about 1.3 times more commands per
hour through their terminals than the free services do (averages of
113.3 commands per hour versus 86.2 commands per hour) and also run
about 1.7 times more questions per hour through their terminals than
the free services do (averages of 3.4 questions per hour for the
charging services versus 3.2 questions per hour for the free services).
This supports the notion that the cost-conscious installations are
more productive searchers; however, this data is clouded by the fact
that all of the charging installations have high speed terminal
equipment; consequently we do not know what contribution to terminal
productivity is made by these two separate factors.

6. User Versus Intermediary Searching, and Extent of User Involvement

At the 1974 ASIS annual meeting, Dave McCarn gave a paper which
described some of the experiences with MEDLINE searching. In that paper
he noted that 73% of the MEDLINE searches were run without the user
being present, even though it was his experience that on-line searches
took slightly less time to perform when the user was present during
the search operation and participated in the search process. This result
is contrary to the experience of some other searchers. No data was
collected during this project to test this hypothesis; however, it is
mentioned as another possible factor that might influence terminal
productivity.

7. Availability and Use of Analyst Support Tools 

Search efficiencies could be influenced by the extent to which
analyst support tools (e.g., thesauri, term frequency lists, operating
manuals, other authority lists) were available and used by the searchers.
We do know that most of the installations had the more important tools,
but we do not know the extent to which they were used.
VII. GUIDELINES FOR SEARCHING THE ERIC FILES USING DIALOG



A. INTRODUCTION 

In attempting to develop some guidelines for on-line searching of
the ERIC data base with DIALOG, consideration was given to the following
areas:

-- Pre-Search Activity (general considerations, procedures,
   decisions)

-- Terminal Activity (recommended keyboard procedures)

-- Search Strategy (number of terms needed to adequately express
   each facet or concept of a multi-facet search; methods of limiting
   quantity of output; effect on relevance).

Each of these areas is treated below as a separate section. Topics
in the last two sections were suggested by an informal paper by Charles Missar
of the National Institute of Education, and in these sections we look into
the quantitative aspects of attempts to increase recall, on the one hand, and
to limit quantity of output, on the other hand.

These guidelines are written to incorporate information from many
sources, including the findings of this project, comments and suggestions
made by search analysts and terminal operators from many search facilities,
and comments made at recent ERIC users meetings.

This chapter is not intended to serve as an introduction to DIALOG,
or to the ERIC data base. For those topics the reader is referred to
Lockheed's Terminal Users Reference Manual, Interchange, and Lockheed's
DIALOG Chronolog. This chapter is directed specifically to the use of the
ERIC data base implemented in DIALOG, and is not necessarily generalizable
to other data bases or other search systems.

Many of the following recommendations are routine practice for many
existing ERIC/DIALOG installations, and even for some other on-line search
systems. However, we review them here for completeness and for the benefit
of new terminal users.

Our frame of reference is that at present, most DIALOG searchers are
acting as intermediaries, interpreting and acting upon requests received
from requestors by mail, by telephone, in person, or through further
intermediaries in the field. It is expected that DIALOG searching will
continue to be done primarily by trained intermediaries. Some of the points
we shall discuss will, however, also be applicable to the work of a requestor
searching directly without using an intermediary.
The intermediaries will generally be operating in one of two
environments: an information retrieval and dissemination center (where the
work of the center is mainly devoted to processing search requests); or a
library (where on-line searching is but one of a wide spectrum of reference
services provided). At the time of this study most of the installations
searching the ERIC files were of the former type. In the future it is quite
likely that more libraries will offer on-line searching services as one part
of their regular reference services, and more terminals will be installed in
offices and departments to provide direct service to end-users.



B. PRE-SEARCH ACTIVITY

Given the above environments, this section will discuss some of the
decisions which are made (or should be made) by searchers, consciously or
unconsciously, during the pre-search period, i.e. before going to the
terminal.

1. Decision: Whether to Go On-Line

The searcher-intermediary should consider whether a particular question
could be handled as well manually as on-line. (Note: batch searching is a
separate issue that is out of the scope of this study.) The major reason for
considering this question is that in some circumstances a manual search may
be more cost effective than an on-line search. Furthermore, the requested
material may be out of scope of the ERIC file; the moral of this is, "Don't
do an on-line search for things that are not in the file."

The environment may affect the decision: a librarian with the printed
indexes conveniently at hand might opt for a manual search in some cases; an
information center staff member might receive only pre-screened questions
which had already passed this decision point; a person with ready access to
a terminal but without easy access to the printed indexes might prefer the
on-line search in any event; a searcher with no budget restrictions might
prefer in all cases to do an on-line search.

To understand the alternatives, consider that there are no multi-year
cumulations of the ERIC indexes for some of the search access points. In
manually searching the printed ERIC indexes, a searcher must consult annual
indexes for each past year of interest, and semiannual or quarterly indexes
for the current year, plus the indexes in each issue of the current year not
yet cumulated. All of this must be done separately for the RIE and CIJE
series of publications. To do a comprehensive single-author search of the
printed ERIC indexes as of this report date would require over 10 minutes of
manual lookup effort in 25 separate volumes (12 RIE volumes: annual indexes
from 1967, plus supplemental issues; 25 CIJE volumes: annual indexes from
1969, plus supplemental issues). On the other hand, the DIALOG on-line
search provides access to the combined RIE and CIJE files, back to their
inception in 1966/7 and 1969, respectively. The RIE and CIJE files are now
updated monthly. A single term search for the combined RIE/CIJE file would
typically take about two minutes or less of on-line time, especially if a
fast search process were used (e.g., BEGIN BYPASS, SELECT term, PRINT).

The only general guideline proposed here is that manual searches
should be seriously considered for some types of simple searches,
particularly if the installation is very conscious of the costs of on-line
service. However, the exact response also depends on what type of simple
search is required. For example, for single term searches:

. personal author search. It is probably faster and more
  cost-effective to search on-line than to manually search through at
  least 12 separate printed RIE indexes or 25 separate CIJE indexes,
  particularly over long time periods.



. corporate author search. It may be a toss-up. Because the on-line
  display of corporate author entries is limited to two 24 character
  lines, there may be some ambiguity in the displayed items (e.g., as
  for the several University of California entries shown in Figure 17)
  that will require more on-line time for citation displays or printouts
  in order to search the desired institution. In the case of long or
  complex corporate author names, it might be better to do a manual
  search. Figure 17 provides some examples of the different forms of
  entry of identifiers and "publication source" entries, and the effect
  of truncated index entries (as presented by the EXPAND command) on
  legibility of corporate author entries. The different forms reflect an
  area in which the ERIC processing centers have not in the past exerted
  rigid authority control. The truncation by the EXPAND command is a
  system feature which is an inconvenience in this regard, and hopefully
  could be improved upon. The printed source index, Institutional
  Sources, Statistics and Postings, will provide the accession number as
  shown in Figure 18 for reports associated with the names of
  organizations which prepared documents (Institutional Source) or which
  sponsored the work (Sponsoring Agency) covered in the RIE data base. It
  is fully cumulated annually and can result in a fast manual search,
  although yielding only accession numbers. The full text indexing of
  corporate source entries will provide some on-line advantage here when
  it becomes available.

. subject search. In some instances it might be better to do this
  manually. A cumulative printed index of Descriptors and Identifiers
  is available, as shown in Figure 19. It gives an ED or EJ number for
  all of the items indexed by that term through April 1973. For some
  searches, such as those that do not require a search of the most
  recent material, this might be entirely adequate. However, no abstract
  or citation is printed by this type of search.

. title search. Title searches can presently only be done manually,
  using the printed Title Index, which is fully cumulated annually and
  provides title access to the entire RIE report collection through an
  alphabetic listing of all RIE titles. The DIALOG system presently does
  not provide a title word search capability for the ERIC data base;
  however, it is scheduled to be available in January 1975.

. number search. Searches of the RIE data base by report numbers,
  project numbers, contract numbers, and grant numbers can be done
  very quickly with the printed ERIC tools: Report/Project Number
  Index, Contract/Grant Number Index, and Clearinghouse Number To ED
  Number Cross Reference List. These publications are cumulated
  through December 1972. All of these files can be searched on-line.

Because some of the single-term searches will yield a large number of
retrieved citations, the manual searcher may still be faced with an output
task of locating and copying the citations and abstracts from the monthly
issues of RIE or CIJE. One alternative to consider here is to search the
term in the printed tools in order to determine the number of retrieved
citations. For more than very few citations it would clearly be easier
and less expensive over all to search and print the citations and abstracts
with the computer approach.

On-line searches are most appropriate for multi-term or multi-aspect
searches when an intersection of two or more terms or groups of terms is
required to answer a question. An intersection is defined here as the
combination of two or more terms using Boolean AND logic.
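
The intersection of groups of terms can be modeled with ordinary set operations. The sketch below is a minimal Python illustration, not DIALOG command syntax: the helper names `union_of` and `intersect`, the choice of terms, and the accession numbers are all assumptions made for the example.

```python
# Hypothetical postings: index term -> set of accession numbers (invented).
postings = {
    "HIGH SCHOOLS": {1, 2, 3, 4},
    "SECONDARY SCHOOLS": {3, 4, 5},
    "SCIENCE INSTRUCTION": {2, 4, 6},
    "MATHEMATICS INSTRUCTION": {5, 7},
}

def union_of(terms):
    """OR: items indexed by any of the terms in one group."""
    result = set()
    for term in terms:
        result |= postings.get(term, set())
    return result

def intersect(groups):
    """AND: items that appear in every group of terms."""
    sets = [union_of(group) for group in groups]
    return set.intersection(*sets) if sets else set()

grade_level = ["HIGH SCHOOLS", "SECONDARY SCHOOLS"]
subject = ["SCIENCE INSTRUCTION", "MATHEMATICS INSTRUCTION"]
hits = intersect([grade_level, subject])
print(sorted(hits))
```

Doing the same job by hand would mean scanning the printed index under each of the four terms and comparing accession numbers across the lists, which is the labor the on-line intersection removes.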

The relative merits of on-line vs. manual searching are less easily
seen for questions where a few terms in a simple OR relationship are required.
Often such a search could be carried out rather easily, though not as quickly,
using conventional printed indexes.

Given the present DIALOG system, ERIC data base, and printed ERIC
indexes, there are several points in favor of doing an on-line search instead
of a manual search in the printed ERIC indexes:

-- The search is done in one operation, rather than having to be repeated
   over many printed index volumes.

-- Both major descriptors (those marked with asterisks on the
   computer printout copy and in the printed indexes) as well as
   minor descriptors (unmarked descriptors, which are in the machine
   file but omitted from the printed indexes), may be searched
   on-line. This means that in cases where a requester desires to see
   all citations which have been indexed by a specific term, a
   computer search would be appropriate; in fact this search could not
   be done with the printed indexes. A more detailed discussion of
   the major/minor descriptor values used in ERIC indexing is given
   in the later section on Methods of Limiting Quantity of Output.

-- Identifiers, which do not appear in the printed indexes, but are
   contained in the machine file, may be searched. Identifiers are
   often used when a term is new and has not yet graduated to
   descriptor status.

-- In cases where the printed indexes are not readily available,
   on-line searching will probably be more convenient.

-- Title word searching (if and when added) will be an on-line
   capability with no manual equivalent.

-- Stem searching will be an on-line convenience when searching some
   terms (e.g., computation, computational, computed, computer,
   computer-, computerized, computers, computing).

-- After identifying the relevant ED or EJ numbers, a computer-printed
   bibliography can be obtained faster, more conveniently, and at
   significantly less cost than the alternative manual process of locating
   each citation in the appropriate RIE or CIJE monthly volume and then
   copying the selected citations and abstracts. This output effort can
   be a significant factor when the typical search results in 50-100
   citations.
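
The stem searching mentioned in the list above amounts to matching every index term that begins with a given stem. The following Python sketch illustrates the idea only; the function name `stem_search` is ours, and the term "composition" has been added to the list purely as a non-matching contrast.

```python
# Terms drawn from the stem-searching example above, plus one
# non-matching term ("composition") added here for contrast.
index_terms = [
    "computation", "computational", "computed", "computer",
    "computerized", "computers", "computing", "composition",
]

def stem_search(stem, terms):
    """Return every term that begins with the given stem."""
    return [term for term in terms if term.startswith(stem)]

matches = stem_search("comput", index_terms)
print(matches)
```

A single stem thus replaces seven separately keyed terms, at the price of also retrieving any unwanted term that happens to share the stem.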

2. Decision: Whether to Use Printed Analyst Support Tools Before Going to
the Terminal



Most terminal installations now argue that searchers should do some
planning and analysis work on most searches before they go to the terminal.
This includes at least the preliminary identification of the major facets to
be searched, the logical relationships between these facets, and some initial
search terms. Some installations consider it essential to use some sort of
form sheet to work up the search specifics prior to searching. During this
pre-search activity, the analyst may benefit from one or more of the analyst
support tools discussed below.

a. The ERIC Thesaurus 

The ERIC system performs subject indexing of incoming items in
conjunction with a controlled vocabulary that was developed at the beginning
of the ERIC system, and has been carefully and closely maintained since then.

This indexing vocabulary of over 7,500 terms is published as the ERIC
Thesaurus, and re-issued in an updated form annually. A sample page from
that Thesaurus is shown in Figure 20. In the ERIC system, all of the
subject index terms listed in the Thesaurus are defined to be Descriptors,
and that terminology and distinction is used in this report. Other
uncontrolled subject index terms may also be assigned to each incoming item,
particularly for specific names (e.g., Bronx Zoo, B6700, Captain Kangaroo),
or terms that are not likely to result in enough postings to make it
worthwhile to include in the Thesaurus (e.g., caper, cardiac, cats). In the
ERIC system, these terms, over 22,000 of them, are defined to be Identifiers.
It is possible that the same term might be used in some earlier items as an
Identifier, and in a later item as a Descriptor. There is an average of 10.46
Descriptors per RIE accession, and 6.[illegible] Descriptors per CIJE
accession. There is an average of 1.75 Identifiers per RIE accession, and
1.37 Identifiers per CIJE accession.

Searchers should consider whether, and how much, they should use the
ERIC Thesaurus before going to the terminal, since the Thesaurus is also
available for on-line display and may be used efficiently there. The
searcher may choose between the following alternatives:

1) Using the printed Thesaurus before going to the terminal, and not
   using the Thesaurus on-line. (This may be cumbersome.)

2) Using the on-line Thesaurus with no use of the printed Thesaurus.
   (This may suffice for experienced searchers.)

3) Using the printed Thesaurus to sketch out the proposed search, and
   then the on-line Thesaurus for convenient selection of terms.
   (This may work well for less experienced searchers.)

4) Not using the Thesaurus at all. (This is not advisable.)
  BT Language Arts
     Literacy
  RT Braille
     Character Recognition
     Cloze Procedure
     Context Clues
     Diacritical Marking
     Initial Teaching Alphabet
     Inner Speech (Subvocal)
     Pacing
     Pattern Recognition
     Reading Ability
     Reading Achievement
     Reading Assignments
     Reading Centers
     Reading Clinics
     Reading Comprehension
     Reading Consultants
     Reading Development
     Reading Diagnosis
     Reading Difficulty
     Reading Failure
     Reading Games
     Reading Habits
     Reading Improvement
     Reading Instruction
     Reading Interests
     Reading Level
     Reading Materials
     Reading Processes
     Reading Programs
     Reading Readiness
     Reading Readiness Tests
     Reading Research
     Reading Skills
     Reading Speed
     Reading Tests
     Retarded Readers
     Sequential Reading Programs
     Telegraphic Materials
     Vocabulary

READING ABILITY  440
  NT Reading Skills
     Reading Speed
  BT Language Ability
  RT Cloze Procedure
     Informal Reading Inventory
     Reading
     Reading Achievement
     Reading Comprehension
     Reading Development
     Reading Diagnosis
     Reading Level

READING ACHIEVEMENT  440
  UF Reading Gain
  BT Achievement
  RT Academic Achievement
     Early Reading
     Reading
     Reading Ability
     Reading Development
     Reading Level
     Reading Skills

READING ASSIGNMENTS  440
  BT Assignments
  RT Reading

READING CENTERS  210
  BT Educational Facilities
  RT Reading
     Remedial Reading

READING CLINICS  210
  NT Remedial Reading Clinics
  BT Clinics
  RT Reading

READING COMPREHENSION  440
  BT Comprehension
     Reading Skills
  RT Cloze Procedure
     Content Reading
     Context Clues
     Factual Reading
     Informal Reading Inventory
     Literary Discrimination
     Readability
     Reading
     Reading Ability
     Reading Development
     Reading Skills
     Word Recognition

READING CONSULTANTS  380
  BT Consultants
  RT Reading



READING DEVELOPMENT  130
  BT Language Development
  RT Adult Reading Programs
     Basic Reading
     Directed Reading Activity
     Factual Reading
     Readability
     Reading
     Reading Ability
     Reading Achievement
     Reading Comprehension
     Reading Habits
     Reading Processes
     Reading Skills
     Reading Speed
     Vocabulary Development

READING DIAGNOSIS  440
  BT Educational Diagnosis
  RT Etiology
     Reading
     Reading Ability
     Reading Tests

READING DIFFICULTY  440
  UF Reading Disability
  BT Language Handicaps
  RT Dyslexia
     Learning Disabilities
     Reading
     Reading Failure

Reading Disability
  USE READING DIFFICULTY

Reading Enjoyment
  USE LITERATURE APPRECIATION

READING FAILURE  440
  BT Academic Failure
  RT Reading
     Reading Difficulty

Reading Gain
  USE READING ACHIEVEMENT

READING GAMES  510
  BT Educational Games
  RT Reading
     Reading Instruction
     Reading Materials

READING HABITS  440
  BT Behavior Patterns
  RT Habit Formation
     Language Development
     Reading
     Reading Development
     Reading Skills
     Study Habits

READING IMPROVEMENT  440
  BT Improvement
  RT Reading

READING INSTRUCTION  270
  UF Teaching Reading
  NT Language Experience Approach
  BT Language Instruction
  RT Adult Reading Programs
     Braille
     Content Reading
     Directed Reading Activity
     Early Reading
     Experience Charts
     Individualized Reading
     Initial Teaching Alphabet
     Kinesthetic Methods
     Large Type Materials



Fig. 20. Sample Page from ERIC Thesaurus



Unless the searchers are very experienced and familiar with the
subject matter of the particular question at hand, some initial use of the
printed Thesaurus is advisable. The search should be sketched out in advance
of terminal use, showing the facets which are to be developed, and delineating
the logical relationship between facets. (By facet we mean a term or group
of terms which expresses one aspect of a search topic. Typical ERIC search
facets would be age/grade level (e.g., high school, secondary school), subject
field (e.g., science, mathematics), and approach (e.g., audiovisual
instruction). Each facet can usually be expressed by several roughly
equivalent terms which are ORed together; two or three facets are typically
ANDed together, forming an intersected set.) In our opinion, no more than a
sketch is needed at this point; it would be cumbersome to write out great
lists of terms by hand. But the skeleton set of terms provides the searcher
at the terminal with:

-- Starting points for use of the EXPAND command;

-- Memory jogs in case some desired terms do not show up in the
   Thesaurus as "related terms" during the course of the search;

-- Advance planning time for handling terms which need special
   treatment, e.g., terms which themselves include more than one
   facet of the planned search ("secondary school science" as
   opposed to "science instruction" and "secondary schools," etc.);

-- Information regarding which terms should perhaps not be keyed
   on-line for a descriptor search (because the term is absent from the
   Thesaurus). Thesaurus "Use For" terms are a good example of this.
   The Use For relationship indicates that one term should be used
   for indexing or searching instead of another. The Use For terms
   are given in the printed Thesaurus but are not given in the
   on-line display.
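
Because the Use For cross-references appear only in the printed Thesaurus, a searcher could resolve them during pre-search planning. The sketch below illustrates that lookup in Python; the three cross-references are the ones printed on the sample Thesaurus page (Fig. 20), while the function name `preferred_term` and the list of planned terms are hypothetical.

```python
# USE cross-references as printed on the sample Thesaurus page (Fig. 20):
# non-preferred term -> Descriptor to be used instead.
use_references = {
    "READING DISABILITY": "READING DIFFICULTY",
    "READING ENJOYMENT": "LITERATURE APPRECIATION",
    "READING GAIN": "READING ACHIEVEMENT",
}

def preferred_term(term):
    """Map a non-preferred term to the Descriptor that should be keyed instead."""
    key = term.strip().upper()
    return use_references.get(key, key)

# Hypothetical list of terms jotted down before going to the terminal.
planned = ["Reading Gain", "Reading Skills", "Reading Disability"]
to_key = [preferred_term(term) for term in planned]
print(to_key)
```

Terms with no cross-reference pass through unchanged, so the same step can be applied to the whole planning sheet.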

b. Term Frequency Lists

A helpful tool for identifying situations in which there may be many
postings for a given term is a cumulative term frequency list, which shows
how many file items have been indexed by each term used in the ERIC system.
Such a listing can also identify terms that might have been considered for
searching, but should not be keyed in on-line because no items (as of the
date of the list) are indexed by that term. Term frequency lists for the
ERIC data base have been prepared by several organizations, and can generally
be obtained at very little cost from the originating organization. A brief
description of several of these lists is given in Table 18. Sample pages
from two of these lists are given in Figures 19 and 21.
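
The two uses of a term frequency list mentioned above (spotting heavily
posted terms, and spotting terms with no postings) amount to a simple
lookup. The sketch below assumes the list has been transcribed into a
dictionary; the function name, the threshold of 1000 postings, and all the
posting counts are hypothetical and are not taken from any of the lists.

```python
def screen_terms(planned, postings, heavy=1000):
    """Sort planned terms into three groups using a term frequency list.

    planned  -- terms the searcher is considering
    postings -- mapping of term -> number of items indexed by that term
    heavy    -- assumed cut-off above which a term counts as heavily posted
    """
    not_posted, heavily_posted, usable = [], [], []
    for term in planned:
        count = postings.get(term, 0)  # a term missing from the list has no postings
        if count == 0:
            not_posted.append(term)
        elif count >= heavy:
            heavily_posted.append(term)
        else:
            usable.append(term)
    return not_posted, heavily_posted, usable


# Hypothetical counts, for illustration only.
postings = {"READING": 1600, "READING ABILITY": 350, "LIBRARIES": 900}
planned = ["READING", "READING ABILITY", "READING READINESS SKILLS"]

not_posted, heavily_posted, usable = screen_terms(planned, postings)
print("Do not key on-line:", not_posted)
print("Expect many postings:", heavily_posted)
print("Usable as planned:", usable)
```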

The Lockheed list gives frequencies of both Descriptors and Identifiers,
merged together in one alphabetical listing just as they are displayed on-
line by the EXPAND command. The Macmillan report gives term frequencies as
well as ED or EJ numbers of items indexed by those terms, but divides the
report into Descriptor and Identifier sections. The North Carolina report
gives term frequencies only.
TABLE 18. TERM FREQUENCY LISTS FOR THE ERIC DATA BASE

                                              DATA INCLUDED
                   Date of       Descriptors and                Posting    Accession
Source             Cumulation    Identifiers                    Frequency  Number

Lockheed           Aug. 197[?]   X (merged list)                X

Macmillan          Apr. 1973     X (separate lists for          X          X
Information                      Descriptors and Identifiers,
                                 divided by source: RIE or
                                 CIJE)

North Carolina     Sept. 197[?]  X (merged list of              X
Board of Science                 Descriptors and Identifiers,
and Technology                   divided by source: RIE or
                                 CIJE)


c. Other Printed Aids

Another useful printed publication which should be mentioned is the
ERIC Processing Manual, which contains information about ERIC indexing and
other characteristics of the data base.

3. Decision Whether to Use SEARCH SAVE

The SEARCH SAVE feature, which is provided by the DIALOG system for
the ERIC files, enables searchers to store search statements for later
execution with the same or another search request. This feature provides
an easy and time-saving way to handle commonly-recurring search facets (e.g.,
a school grade level) instead of reconstructing them each time they are needed.
Some facets such as elementary/secondary education might require 35 or more
terms for a complete description; it would be a terrible inconvenience to have
to re-key those terms every time that facet was used in a search.

An example of such a saved facet is shown in Figure 22. The SEARCH
SAVE file may be thought of as analogous to a collection of computer sub-
programs which can be called up by a programmer when needed. A given in-
stallation may wish to create its own library of SEARCH SAVES for its own
subject areas or repeating concepts. An installation may also use already
existing ones, by consulting the list of SEARCH SAVES published by Lockheed,
and illustrated earlier in Figure 22. As a side comment here, the usefulness
of the SEARCH SAVE list would be enhanced by a title index, and perhaps a
keyword index. The SEARCH SAVE feature is intended to be used as a basis for
current awareness searching (i.e., SDI) for a given profile. At the time of
this report such current awareness searching was implemented only on the
Predicasts data base, and no date had been announced yet regarding its use
with the ERIC data base.

Before going to the terminal, the searcher should note the file number
of any SEARCH SAVE to be used. If no existing SEARCH SAVE is exactly right,
but one is needed, the searcher should plan to create the SEARCH SAVE as a
separate step.

A SEARCH SAVE is stored when the command END/SAVE, or ./SAVE, is issued.
At this time the DIALOG system responds, on the terminal, with a 2-character
number, such as 6G. The searcher must record this number, by keeping the
terminal's printout or by writing the number down, in order to be able to
later RECALL the search. Unfortunately the number is not printed on the
search history which accompanies any off-line printed citations. It would
be helpful if Lockheed would incorporate the number of any newly-created
SEARCH SAVE into the off-line printed search history.

A saved search may be recalled and used by giving a .RECALL nn
command, where nn is the previously issued SEARCH SAVE number, followed
by the command .EXECUTE (n). The descriptor postings are newly derived for
the sets specified. The saved search executes to the end, or executes the
set number specified, with all its previously defined component sets.
USER  SER. #  DATE       NAME   TITLE

7     N       12/?3/72   CLAY   SEARCH SAVE/HIGH SCHOOLS

SET   COMMAND

      SEARCH SAVE/HIGH SCHOOLS
      CLAY
      SMERC
      SAN MATEO

 1    #HIGH SCHOOLS
 2    #SENIOR HIGH SCHOOLS
 3    #SECONDARY GRADES
 4    #SECONDARY SCHOOLS
 5    #SECONDARY SCHOOL STUDENTS
 6    #GRADE 9
 7    #GRADE 10
 8    #GRADE 11
 9    #GRADE 12
10    #HIGH SCHOOL STUDENTS
11    #HIGH SCHOOL CURRICULUM
12    #HIGH SCHOOL ROLE
13    #SECONDARY EDUCATION
14    $1-13/+

      ./SAVE

Fig. 22a. Example of a Saved Search


SET   ITEMS   DESCRIPTION
25    11863   SERIAL NO. N/

Fig. 22b. Example of Message Reporting Execution of
          Search Save N, Shown Above



When a saved search is executed, only its number is reported on the
search history, as shown in Figure 22. A much more intelligible search
history would result if the title of the saved and recalled search were
given as well.

The SEARCH SAVE feature was announced in Fall 197[?]. As shown by
the RECALL and EXECUTE command use data in Table 6, it does not appear to
have been used very extensively by many terminal installations, and may not
have contributed significantly to the performance of the installations during
the 15-day period that we examined closely.

4. Advance Determination of Possible Ways to Limit Output 

It should be ascertained in advance (while discussing the search topic
with the requestor, if possible) whether a broad or narrow search is desired
by the requestor; how many citations are desired (or expected); and whether
a limitation by date or other criteria would be acceptable if too many cita-
tions are retrieved. How many citations are "too many" varies with the
individual; most installations have a working assumption that a number of
citations from 50 to 100 is appropriate, and more than this number is too
many. A few installations feel that most of their users do not need or want
more than 5-10 citations.

Many different criteria can be used with the ERIC file as a basis for
limiting the output on something other than a subject basis. Examples of
limits that can be used are:

. date (of publication, of ERIC accession)

. contributing ERIC clearinghouse (e.g., EC, IR)

. ED versus EJ publication (ED only, EJ only)

. type of publication (state-of-the-art review, annotated
  bibliography)

. availability of the cited publication through the ERIC Document
  Reproduction Service

. total number of citations to be printed.

These parameters are often built into search request forms. Sometimes
additional limiting facets can be specified, e.g., "curriculum work only",
"evaluations only". A written statement of the search request should be
obtained whenever possible; such written statements often provide clues
which can be helpful if the search does not proceed as expected at the
terminal.
C. TERMINAL ACTIVITY

1. Equipment Considerations



ERIC/DIALOG may be accessed using a number of different equipment
configurations (e.g., high-speed or low-speed terminals; high speed dedicated
phone lines or lower speed dial-up phone lines; CRT (cathode ray tube) or
hard-copy terminals or combinations of these). These considerations were
discussed in more detail in an earlier section of this report. During this
study we used primarily a high speed (480 characters/second) leased line
and CRT terminal, with an auxiliary hard-copy printer; we also used a slow
speed dial-up hard-copy terminal. Both types of configuration performed
satisfactorily for us.

In considering whether or not to use a CRT-only terminal, a hard-
copy-only terminal, or a CRT terminal supplemented by hard-copy printout of
selected pieces of information, the following points should be considered.

Hard copy output of the terminal has several advantages:

-- Useful in tracing and recording previous steps in search execution
   (This may be done on a CRT by the DISPLAY SEARCH HISTORY command.)

-- Can be used for direct printing of retrieved citations at the
   terminal

-- Provides a printed record of file numbers of saved searches (see
   previous section on using the SEARCH SAVE)

-- Provides an immediate printed record of the elapsed search time
   that can be used for charging and cost accounting purposes for
   those installations that recover costs by service charges.

On a configuration that has both a high-speed CRT and an auxiliary printer,
the printer is usually used only to print DIALOG commands and desired cita-
tions, thus reducing the volume of terminal printing activity. On hard-
copy-only terminals, all DIALOG responses are printed out; this may consume
a considerable amount of paper (and make a considerable amount of noise if
mechanical printers are used), especially if the EXPAND command is used.

Disadvantages of hard-copy-only terminals are:

-- Uses a lot of paper, especially if EXPAND commands are used;
   hence the use of this command might tend to be discouraged

-- Slower speed than CRT terminals (in characters per minute) for
   most types of hard copy terminals

-- May be noisier than the other alternatives

-- If a high speed terminal is desired (e.g., 480 characters/sec),
   it is generally more expensive (for both equipment and supplies)
   to use high speed printing equipment than CRT equipment.
Because Lockheed's response time in delivering off-line printed citations is
so fast (citations are printed off-line in the early shift of the morning
following Lockheed's receipt of a PRINT command, and sometimes the same
day, and then sent by mail), the time advantage gained by printing citations
at the terminal is slight (only a few days); however, having a paper copy can
be useful for recordkeeping and reference purposes.



We favor a CRT terminal installation that includes a hard-copy feature,
but realize that this may not be cost effective for some other installations.
The search speed can be improved through the use of high-speed communications
and display equipment, and this should be considered for installations which
do a large volume of searching. Test data reported in earlier sections of
this report showed that a significantly larger volume of work (questions or
searches per hour, commands per hour) was passed through the high speed ter-
minals for the same unit of time. The high speed terminals can be cost
effective at moderate volumes of search activity and provide considerable
cost savings at high volumes and can be justified from only their fixed



In the previous section we discussed some aspects of search negotiation
and preparation which may take place before the searcher goes to the terminal.
In this section we will discuss the procedures which may be followed by the
searcher at the terminal. It is assumed that the installation will have good
sign-on and sign-off procedures to avoid the charges for terminal time while
the searcher takes a break, is interrupted for any significant period of time,
or walks away from the terminal and forgets to disconnect the terminal from
the system.

In most installations under study here there are probably some tacit
assumptions about the frame of reference of the searches being done. One
assumption which has a direct effect on activity at the terminal relates to
requestor involvement: Is the requestor assumed to be interested in, and
capable of understanding, the search logic which produces the list of cita-
tions? Or is the requestor assumed to be interested only in the output,
and not at all in the process? Is the search process iterative with respect
to a given request? Or is iteration limited to processing new search topics
for a given requestor? The project team assumed that most searches are
"one-shot" efforts, not expected to be revised or re-run. However, iteration
may occur when the requestor needs an update of the search.

In the long run, it seems likely that a repeat customer will be one 
who has derived a measure of satisfaction from the retrieved material. This 
satisfaction may well be influenced by understanding the search process, 
thus prompting the requestor to participate in future iterations of the search 
process. We feel that requestor involvement is important, and that it can be
encouraged partly by an understandable search history printout, which provides
the means for evaluating the usefulness of terms that caused citations to be
retrieved. The printed search history can serve as a very useful focal point
for discussions between the requestor and the searcher about the search re-
sults. The experience of interpreting a search history in conjunction with
its output should be helpful in the development of future searches.




2. Recommended Keyboard Procedures 
The following recommendations are made with two goals in mind: under-
standability of output to the requestor, and through-put speed at the termi-
nal (terminal productivity). We recommend keyboard procedures which will
provide the requestor with a "readable" printed search history, and clearly
indicate the terms and strategy which have produced the resulting citations.
The recommended procedures are also relatively fast, though not the fastest
possible procedures.

a. Initialization

DIALOG provides an initialization routine that is started by the BEGIN
command (1). This routine prompts the searcher to keyboard the title,
searcher, requestor, and mailing address information; this information is
then printed at the top of the search history which accompanies any citations
printed off-line. It is also printed and displayed at the terminal. Initiali-
zation results in a very useful and clearly identifiable search output record.
However, the initialization routine presently requires a considerable amount
of terminal time (an average of 3.0 minutes to initialize, according to the
use statistics reported in an earlier section of this report).

One alternative to using the full initialization routine is to use
BEGIN BYPASS. In this case the search history is not identified by requestor,
searcher, or title.

Another alternative to using the full initialization routine for each
search is to include several "questions" after one initialization. For this
study we have used Lockheed's definition of "search" and "question" that is
described in an earlier section of this report, i.e., a search is bounded by
a BEGIN command and an END-BEGIN, disconnect, or END-disconnect combina-
tion. Questions within a search are bounded by additional END commands. Thus
a BEGIN followed by searching commands followed by an END, more searching
commands, an END and a new BEGIN, would be considered as one search with two
questions. This was discussed in some detail in an earlier section of this
report. When several questions are included after one initialization, the
search strategy used for each question will be included in the search history
printed off-line, but the citations printed off-line will usually correspond
only to the latest section of the search history (the portion since the
previous END command). Because the requestor's terms may be mixed in the
sequence of searcher actions, the requestors will probably not be able to
easily interpret the search history, if indeed they see it at all. However,
terminal time may be saved by grouping several questions into one search, if
some concepts or terms are used in more than one question. Such economies
are probably most suitable for installations processing a large number of
search requests, for this approach requires some experience on the part of
the searchers.
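
The definition of "search" and "question" given above can be applied
mechanically to a sequence of commands. The sketch below is illustrative
only: the function name and the use of the word DISCONNECT as a marker for
the end of a session are assumptions, and work that is never closed by an
END is, by assumption, not counted as a question.

```python
def questions_per_search(commands):
    """Return the number of questions in each completed search.

    A search opens at BEGIN and is closed by the next BEGIN or by a
    disconnect. Each END inside an open search closes one question.
    "DISCONNECT" is a hypothetical marker for the end of a session;
    it is not a DIALOG command.
    """
    completed = []
    questions = None  # None means that no search is currently open
    for command in commands:
        verb = command.split()[0].upper()
        if verb == "BEGIN":
            if questions is not None:
                completed.append(questions)  # a new BEGIN closes the old search
            questions = 0
        elif verb == "DISCONNECT":
            if questions is not None:
                completed.append(questions)
            questions = None
        elif verb == "END" and questions is not None:
            questions += 1
    return completed


# The example from the text: one search containing two questions.
log = ["BEGIN", "SELECT READING", "END", "SELECT LIBRARIES", "END", "BEGIN"]
print(questions_per_search(log))
```

The second BEGIN in this log opens a new search that is still in progress,
so only the first search is reported.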

We recommend that each logically distinct search or question be ini-
tialized separately. We feel that initialization provides quite considerable
advantages for subsequent handling and understanding of the printed output.
It would be very useful if a quicker version of the initialization routine
were provided, which would minimize the time disadvantage.
Initialization may be considerably speeded up in the present system
by "stacking" responses, using the semicolon, and providing the initializa-
tion information before the questions are asked. By stacking we mean sending
several commands during the same transmission burst. This may be achieved
simply by keying each command or response to be sent, followed by a semicolon,
another response and semicolon, and filling up to one line of display (62
characters) before pressing the INTERRUPT, RETURN, or CARRIAGE RETURN key.
Figure 23 provides an example of both the regular and the stacked method
of initializing.

The stacking of searcher responses shown in Figure 23 accomplishes
the whole initialization with a minimum of terminal wait time. Sending
more than one line of display (62 characters) at one time, however, results
in a truncation of the character string, and this can mean that semicolons
to send subsequent commands will not be recognized. Stacking would be greatly
facilitated if the length of command string that could be sent were extended
(and perhaps made visible on the screen). Initialization would be facilitated
if a several-line block were provided, without separate promptings, so that
the outputs could be clearly identified with a smaller penalty in time.
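
The stacking rule described above (semicolons between items, no more than
62 characters per transmission) can be illustrated with a small packing
routine. This is a sketch under stated assumptions: the function name is
hypothetical, the semicolon is placed between items with no trailing
semicolon, and the responses are the placeholder values that appear in
Figure 23.

```python
MAX_TRANSMISSION = 62  # characters per transmission, as stated in the text


def stack(items, limit=MAX_TRANSMISSION):
    """Pack commands or responses into as few transmissions as possible.

    Items are joined with semicolons; a new transmission is started
    whenever adding the next item would exceed the limit, so that no
    string is long enough to be truncated.
    """
    transmissions = []
    current = ""
    for item in items:
        if len(item) > limit:
            raise ValueError("item is longer than one transmission: " + item)
        candidate = item if not current else current + ";" + item
        if len(candidate) <= limit:
            current = candidate
        else:
            transmissions.append(current)
            current = item
    if current:
        transmissions.append(current)
    return transmissions


# Placeholder initialization responses (title, searcher, requestor, address).
responses = ["TITLE-OF-SEARCH", "MY-NAME", "REQUESTOR-NAME", "SHORT-ADDRESS"]
for line in stack(responses):
    print(len(line), line)
```

With these placeholder responses the whole initialization fits in one
transmission; longer titles or addresses would spill into a second one.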

b. Relative Merits of EXPAND-SELECT Combination vs. SELECT Alone

Although a search which is analyzed with the printed Thesaurus and term
frequency lists and worked out rather fully in advance may not benefit by use
of the EXPAND command, we believe that the use of the EXPAND command is usually
preferable to the use of the SELECT command by itself.

The EXPAND command works on two levels. The command EXPAND READING, or
EREADING, results in a display of the alphabetically-nearest Descriptors and
Identifiers surrounding the characters READING (see Figure 24). For each line
the display provides a reference line number (E-number) on the left, the term,
a postings figure on the right, and if the term is in the Thesaurus, a figure
indicating the number of terms related to it. A second EXPAND may be used,
e.g., EXPAND E6, to view those related terms. The resulting display (see
Figure 25) is equivalent to the related terms (RT) listings under a given
Descriptor in the printed Thesaurus. The related terms in the Thesaurus
displays have R-numbers (R1-Rn) as reference numbers. They may be further
EXPANDed; this is equivalent to looking from one (bold face) Descriptor
heading to another in the printed Thesaurus.

Advantages of using the EXPAND command are:

-- Alphabetically-near Descriptors and Identifiers may be SELECTed
   quickly and easily from the display, using the E-number or a list
   of E-numbers which may include E-number ranges (e.g., SELECT E6-E10);

-- Related terms from the Thesaurus may be SELECTed easily from the
   display, using the R-numbers as described above for E-numbers;

-- The EXPAND command is not sensitive to typing mistakes; it generates
   a display surrounding whatever character string is given. If the
   typing mistake is near the end of the character string it may still
[Terminal display: the message "YOU MAY ACCESS THE FOLLOWING FILES" and
the list of available files, followed by the TITLE, DATE/FILE, SEARCHER,
REQUESTOR, and ADDRESS lines of the search history, each response having
been keyed after its own prompt.]

Fig. 23a. Regular Method of Initializing
          (Begin Command Not Shown)

[Terminal display: the same file list and search history heading, with
the initialization responses sent together as one stacked transmission.]

Fig. 23b. Stacked Method of Initializing




[Sample EXPAND display for READING (the display referred to in the text
as Figure 24). Column headings: REF, INDEX-TERM, TYPE, ITEMS, RT. Entries
E1 through E15 run alphabetically from the READINESS terms through READING
CENTERS, and include READING ABILITY, READING ACHIEVEMENT, and READING
ASSIGNMENTS. The display ends with -MORE-.]
Fig. 25. Sample EXPAND Display for Related Terms
   result in a useful display. (The SELECT command with a typing
   mistake in the term chosen will give back a zero-posting message
   and have to be repeated.)

-- Less information needs to be keyed in; e.g., keying in the first
   eight or ten characters of a term will bring a display in the
   general alphabetic area desired;

-- The EXPAND results in a lookup in a combined file of Descriptors
   and Identifiers, thus saving a double lookup in the printed refer-
   ence tools;

-- The on-line indexes are usually more up to date than the printed
   Thesaurus and other printed searching aids.

A disadvantage is:

-- If related terms and alphabetically-near Identifiers are not used,
   the EXPAND-SELECT combination requires two commands to obtain one
   term.

c. Relative Merits of SELECT and Straight Typing vs. SELECT E- or
   R-numbers

When SELECTing terms, 5 options are available. These are:

Option 1: Terms may be SELECTed directly, by keyboarding the entire term,
          e.g.

               SREADING PROGRAMS [INT]*

               SREADING READINESS [INT]

Option 2: If an EXPAND has been used, E- or R-numbers can be SELECTed as
          follows:

               SR1 [INT]

               SR7 [INT]

When doing EXPAND-SELECT operations, some time can be saved by
stacking commands in the same manner as in the initialization routine. That
is, after viewing an EXPANDed display, SELECT several of the displayed terms
before keying the INTERRUPT command (e.g., SELECT E6;SELECT E8;SELECT E10-
E12 [INT]). Or call for the next EXPAND command along with the last SELECT
command (e.g., SELECT E6;EXPAND ABSTRACTING [INT]). Remember that the stacked




* [INT] will be used to denote pressing the INTERRUPT, RETURN, CARRIAGE RETURN,
or other send key. S is the form of the SELECT command we preferred, since
it is one character and does not require use of the shift key. Other alter-
natives for the command are the # sign, or the full form, SELECT.
commands must not exceed 62 characters. On dial-up hard copy terminals with
an 80 character line (where the printhead is positioned at the 21st character),
stacked commands should all be contained on a single line of display before
transmitting. On a CRT terminal the 62-character window may be more than one
line long, depending on the width of the display screen.

Options 3 and 4 are simply options 1 and 2, stacked.

Option 3: Entire terms may be keyboarded and stacked, e.g.

               SREADING PROGRAMS;SREADING READINESS [INT]

          With this option the size of the 62-character "window" for data
          entry usually precludes stacking more than two or three Descriptors.
          Also, if terminal errors occur and fully typed Descriptors are
          lost, the wasted keying effort is noticeable.

Option 4: After an EXPAND, several E- or R-numbers can be SELECTed by stacking,
          e.g.

               SR1;SR3;SR9 [INT]

Options 1 through 4 result in each SELECTed term being displayed on 
the search history. This provides a clear record of exactly which terms 
have been used. 

Option 5: Another mode of SELECTing involves a group of E- or R-numbers
          separated by commas (or hyphens), e.g.

               SR1,R3-R5,R7 [INT]

          With this option, the individual terms are not shown on the
          search history. The E- or R-numbers are shown, along with
          the reference point to which they relate, e.g.

               R1,R3-R5,R7
               IT=READING

          (Note: If many terms are selected, sometimes the reference point
          is truncated on the search history.)

Option 5 has the effect of creating an automatic ORed set. It is most
efficient in terms of time, as long as changes are not required later in the
search. The automatically ORed set, however, normally contains items from
only one screen display, and thus unless all desired terms for the facet being
developed are located on one display, a COMBINE command will still be needed
to OR together the automatically ORed sets from several screen displays.

If it is decided later in the search that a certain term should not be
included in the ORed set, the set will have to be created again, unless NOT
logic is employed, with its attendant danger of excluding citations containing
both acceptable and unacceptable Descriptors.
Option 5 implies limited requestor involvement in the search strategy,
since the requestor cannot see, from a list of numbers and a reference point,
what terms are actually being used (unless a hard copy terminal was used and
the requestor is supplied with the full search record).

We recommend option 4 as the usual approach, a method which is fast,
but which is also fully reported on the search history.
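
The five options differ only in what is keyed and how many times the send
key is pressed. The sketch below builds the keyed strings for each option
so that the difference can be seen side by side. The function names are
hypothetical; the S prefix, the semicolon, and the comma follow the
examples given in the text, and each returned string stands for one
transmission (one press of [INT]).

```python
def option1(terms):
    """Full terms, one transmission per term."""
    return ["S" + term for term in terms]


def option2(refs):
    """E- or R-numbers, one transmission per number."""
    return ["S" + ref for ref in refs]


def option3(terms):
    """Full terms, stacked with semicolons into one transmission."""
    return [";".join("S" + term for term in terms)]


def option4(refs):
    """E- or R-numbers, stacked with semicolons into one transmission."""
    return [";".join("S" + ref for ref in refs)]


def option5(refs):
    """E- or R-numbers chained with commas after a single SELECT."""
    return ["S" + ",".join(refs)]


terms = ["READING PROGRAMS", "READING READINESS"]
refs = ["R1", "R3", "R9"]

for name, keyed in [
    ("Option 1", option1(terms)),
    ("Option 2", option2(refs)),
    ("Option 3", option3(terms)),
    ("Option 4", option4(refs)),
    ("Option 5", option5(refs)),
]:
    print(name, "-", len(keyed), "transmission(s):", keyed)
```

Options 3 through 5 each produce a single transmission here, but only
options 3 and 4 keep a separate SELECT for every term, which is what makes
each term appear on the search history. A stacked string must still fit
within the 62-character limit; this sketch does not check for that.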

d. Timings of Five Alternative Procedures for SELECTing Descriptors

The 5 different methods of SELECTing Descriptors were timed, in order
to obtain some idea of the time differences involved. Table 19 shows the
results of this timing exercise.

SELECT straight typing and SELECT R-numbers were compared, using either
individual sends ([INT] after each command), stacking (semicolon after each
command), or chaining (comma after each R-number chosen). For the SELECT R-
numbers section, a display ("EXPAND LIBRARIES") was generated before times
were counted, and an END command was issued so that only the time for actually
SELECTing would be measured. (The display of LIBRARIES and its related terms
was not affected by the END command.) We did not include the time used for
EXPANDing to obtain the display, because the number of EXPAND commands pre-
ceding any one SELECT sequence could vary: one EXPAND would be needed to
create an alphabetical E-number display from which items could be SELECTed;
another EXPAND would be required to create a Thesaurus R-number display; a
further EXPAND to look through the Thesaurus; perhaps a PAGE command to view
a second page of a display. From the LIBRARIES display, ten terms were SELECTed:
the second, fourth, sixth, eighth, etc., related terms. After the ten had
been selected, by whichever method, an END command was issued, and the elapsed
time reported by the system was noted. This process was repeated five times
for each of the five modes of SELECTing. Results indicate that option 5,
"chaining" (using commas with one SELECT command), is fastest, requiring an
average of .34 minutes for ten non-adjacent terms from one screen. "Stacking"
SELECT R-numbers (option 4) was next fastest, requiring an average of .74
minutes for selecting the same ten non-adjacent terms. (This is the option
preferred by the project team.) Third fastest was option 3, the stacked
sending of straight typed terms, with an average of 1.61 minutes to select
the same 10 terms.

Individual sends for SELECT R-numbers (option 2) averaged 1.96 minutes
for the 10 terms, while individual sends and straight typing (option 1) was
the slowest method, averaging 2.23 minutes to SELECT the ten terms.

The difference between the extremes in time is almost two minutes for
ten terms; this is enough time to consider seriously for routine procedures.
(However, the overhead time of EXPANDing the term LIBRARIES in order to create
the Thesaurus display would lessen this difference slightly. On the high-speed
terminal at which this timing experiment was performed, the time required to
create the display was minimal; however, on a slower terminal the time required
to create the displays could be noticeably higher.)
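
The arithmetic behind these comparisons can be laid out explicitly. The
sketch below uses only the five averages reported in the text; the variable
and function names are illustrative, and the per-term figure is simply the
average divided evenly over the ten terms.

```python
# Average minutes to SELECT ten terms, as reported in the text.
AVERAGE_MINUTES = {
    1: 2.23,  # straight typing, individual sends
    2: 1.96,  # R-numbers, individual sends
    3: 1.61,  # straight typing, stacked
    4: 0.74,  # R-numbers, stacked
    5: 0.34,  # R-numbers, chained with commas
}
TERMS_SELECTED = 10


def seconds_per_term(minutes, terms=TERMS_SELECTED):
    """Convert an average for the whole run into seconds per term."""
    return minutes * 60.0 / terms


def difference(option_a, option_b):
    """Difference in average minutes between two options."""
    return round(abs(AVERAGE_MINUTES[option_a] - AVERAGE_MINUTES[option_b]), 2)


for option in sorted(AVERAGE_MINUTES, key=AVERAGE_MINUTES.get):
    minutes = AVERAGE_MINUTES[option]
    print("Option %d: %.2f min, %.1f sec per term"
          % (option, minutes, seconds_per_term(minutes)))

print("Slowest vs. fastest (options 1 and 5):", difference(1, 5), "minutes")
print("Options 4 and 5:", difference(4, 5), "minutes")
print("Options 2 and 3:", difference(2, 3), "minutes")
```

The spread between options 1 and 5 is the "almost two minutes" cited above.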
TABLE 19. RESULTS OF TIMING EXPERIMENT FOR SELECTING 10 DESCRIPTORS
Table 19 shows that stacking commands results in definite time savings.
We recommend that stacking be used whenever possible. The difference in time
between options 2 and 3 illustrates that efficiency in sending commands (use
of stacking) has more effect on time used than efficiency in keyboarding
(keying only short R-numbers rather than full Descriptors).

Between the two fastest methods, option 4 (SELECT R-numbers, stacked)
and option 5 (SELECT R-numbers using commas, chaining), there was an average
difference of only one-half minute for the ten terms. We believe that the
use of option 4 is worth the extra half minute, since the resulting search
history, displaying each term SELECTed, will have increased readability.

3. Strategy

a. Number of Terms Needed to Adequately Express Each Facet of a
   Multi-Facet Search

Due to the frequent use of broad concepts in searching the ERIC files,
and to the characteristics of the ERIC indexing language, one of the questions
faced by searchers at the terminal is the following: how many terms will ade-
quately express a given facet (or aspect, or concept) of a multi-facet search?
How much effort should be expended in looking for possible Identifiers (ERIC's
"free" indexing terms), and for variant forms of terms, such as plurals, mis-
spellings, and alternative punctuations?

The answer to these questions is not obvious when one is dealing with
intersected sets. Sharon Jewell has given some approximations of the kinds
of retrieval quantities one may expect from intersecting heavily- and lightly-
posted terms. To develop guidelines in terms of facets containing several
terms, an attempt was made in the present study to measure the effect on re-
trieval of using varying numbers of terms to express each facet of some real
questions, with two or three facets per question.

We measured the incremental effect (in number of output citations) of
adding each additional term to the facets of 2- or 3-way intersected searches
(searches incorporating 2 or 3 sets combined with AND). We chose to work from
the heavily-posted terms outward, adding one term to each facet (unless the
facet was a SEARCH SAVE) at each increment. We considered working backwards,
by subtracting one term from each facet at each decrement, but rejected this
method because the facets had widely divergent numbers of terms, and it would
have been difficult to determine at what point to decrement the smaller facet.

As a source of real questions, search requests relevant to their personal
interests were solicited from Ph.D. students in the University of California,
Berkeley, School of Education, from ILR staff, and from some persons outside
the University of California. The recipients of the searches agreed to make
relevance judgments of the output.



Searches were negotiated during personal interviews, except for one
search which was negotiated by a colleague of the requestor. The Thesaurus
was used as a source of terms during these interviews, which were held before
the searchers went to the terminal. High recall performance, rather than
precision, was emphasized in the formulation of the question.

At the terminal, the searches were first run in an "exhaustive" manner,
trying to extract as many potentially relevant references as possible, in order
to identify the set of relevant citations in the file. Variant forms (additional
Descriptors, Identifiers, singular/plural forms and misspellings) were SELECTed
whenever appropriate. A total of 14 searches, with a total of 364 terms,
was performed in this way.

We attempted to use as many terms as possible to express each facet in
order to find the point of diminishing returns: i.e., the point at which the
addition of further terms did not result in new citations being retrieved. For
the searches done, facets were represented by a range of 1 to 19 terms with an
average of 11.4 terms per facet. Each term was SELECTed separately in order to
have it identified with its postings figure. This method was necessary for
collecting the numerical data but also coincided with our choice of optimum
searching approach as discussed in an earlier section.

In the searches used for this section of the study, two or three facets
were used in one intersection. In some of the searches, several alternative
set intersections were made; one was chosen for the study.

In most of the three-facet searches there was a facet representing age
or grade level; such facets were handled by using established saved searches
where they existed. Since SEARCH SAVE returns a complete, merged set of term
postings, it was not meaningful to break the set apart in order to treat the
individual terms incrementally. A searcher would not benefit by breaking apart
sets returned by a SEARCH SAVE; if the sets were not appropriate the searcher
should, at another time, create a separate saved search. For this reason we
did not include SEARCH SAVE facets in the incremental treatment. One or two
searches had grade level facets which were not already the subject of saved
searches. These facets were incremented.

The searches were completed in the exhaustive manner, and results were
sent to the users for relevance judgments. Relevance judgments were received
in several different forms, ranging from a binary yes/no, through a "new" vs.
"already seen" and "potentially useful" vs. "not worth looking into" judgment,
to a 1-4 scale of relevance. From these judgments a binary rating was extracted,
incorporating the "most relevant" citations, whether previously seen or new to
the requestor.

Relevance is a judgment as to pertinence to an information need,
eventually as perceived by the requestor of a search. Precision may be considered
as the ratio of the relevant retrieved citations to the total set of retrieved
citations. Recall may be considered as the ratio of relevant citations
retrieved to the total set of relevant citations in the file (which is generally
unknown). For an in-depth discussion of these and other measures, see Lancaster's
Information Retrieval Systems: Characteristics, Testing and Evaluation.
Another excellent discussion may be found in King and Bryant's The Evaluation
of Information Services and Products.
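
The two ratios defined above can be made concrete with a small calculation. The following Python sketch is illustrative only: the function name and the sample accession numbers are invented for the example and are not taken from the study. It assumes that the full set of relevant citations in the file is known, which, as noted above, is generally not the case in practice.

```python
def precision_recall(retrieved, relevant_in_file):
    """Return (precision, recall) for a set of retrieved citation IDs.

    precision = relevant retrieved / total retrieved
    recall    = relevant retrieved / total relevant in the file
    """
    retrieved = set(retrieved)
    relevant_in_file = set(relevant_in_file)
    hits = retrieved & relevant_in_file
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant_in_file) if relevant_in_file else 0.0
    return precision, recall


# Hypothetical output set and hypothetical relevance judgments.
retrieved = {"ED1", "ED2", "ED3", "EJ4"}
relevant = {"ED2", "EJ4", "EJ9"}
precision, recall = precision_recall(retrieved, relevant)
```

In this made-up case two of the four retrieved citations are relevant, and two of the three relevant citations in the file were found.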

Several of the searches achieved quite low precision figures. In some
cases this is probably attributable to the difficulty of matching the search
question to the data base; in some cases to insufficient experience on the
part of the searchers, resulting in omission of important concepts which should
have narrowed the search; in other cases it probably reflects the use to which
the material was to be put. Ph.D. research is apt to be concerned with
theoretical rather than practical aspects, and a good many of the retrieved
citations reflected a "how to do it" approach which was not of interest to several
of these users.


After the exhaustive run had been made, the searches were run again,
this time selecting terms for each facet in decreasing order of the number of
postings.



Then an iterative process began. Taking the most heavily posted term
from each facet (except the grade-level facet which was usually expressed by
a SEARCH SAVE and was not changed in any way), we combined these using AND
logic, and printed the resulting set (using format 1, for brevity), e.g.,

C 1*11 

* In this and the following examples "C" indicates the COMBINE operation,
"*" indicates the AND operation, while "+" indicates the OR operation.

Then we took the first two most heavily-posted terms from each facet,
combined them, and printed the resulting set.

C (1+2)*(11+12) 

This process was continued until the full set of citations retrieved
by the original, exhaustive search was reached.

A three-facet search with one facet represented by a SEARCH SAVE was
treated as follows:

C 1*11*21 (where set 21 represents a saved search)

C (1+2)*(11+12)*21

C (1+2+3)*(11+12+13)*21

An example of an incremental search is given in Figure 26.
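
The sequence of COMBINE statements shown above follows a regular pattern, which can be sketched in Python. This is only an illustration: the function names are hypothetical, the set numbers are assumed to be listed already in decreasing order of postings, and the code merely builds the text of each statement using the "C", "*" and "+" notation explained in the footnote above. It does not communicate with DIALOG.

```python
def combine_command(facets, step, saved_sets=()):
    """Build one COMBINE statement using the first `step` sets of each facet.

    facets     -- list of facets; each facet is a list of DIALOG set numbers,
                  ordered from most to least heavily posted (an assumption)
    saved_sets -- set numbers produced by a SEARCH SAVE; never incremented
    """
    groups = []
    for facet in facets:
        terms = facet[:step]  # a short facet simply stops growing
        expr = "+".join(str(n) for n in terms)
        groups.append(f"({expr})" if len(terms) > 1 else expr)
    groups.extend(str(n) for n in saved_sets)
    return "C " + "*".join(groups)


def incremental_commands(facets, saved_sets=()):
    """Return the COMBINE statements for every increment, in order."""
    longest = max(len(facet) for facet in facets)
    return [combine_command(facets, step, saved_sets)
            for step in range(1, longest + 1)]


commands = incremental_commands([[1, 2, 3], [11, 12, 13]], saved_sets=[21])
for line in commands:
    print(line)
```

Run on the set numbers used in the example above, the sketch reproduces the three statements given in the text.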

Sixteen incremental searches were performed, and fourteen of them
were used for this study. We obtained some very broad questions, and found
that in these cases we were working very near to the limits set for the
DIALOG system, both in terms of number of sets used (DIALOG allows 98 sets)
and in terms of number of postings involved in the sets created. In the
incremental searches we were constantly creating new sets; if several of these
sets had thousands of postings, disk space was used at a terrific rate; thus
three of the broad searches came to an impasse with the message "DISK STORAGE
OVERFLOW". One of the searches was nearly complete (it had achieved 97% of
the retrieval from the exhaustive set and there was only one more term to
increment), and is included in this study as if it were actually a completed
search. The incremental searches for two others could not be completed because
they ran into the system limitations.

Fig. 26. Example of an Incremental Search



Topics for the searches run are given in Appendix B.

Table 20 shows the incremental effect, in terms of the number of
citations retrieved, of adding one more term to each facet. Remember that all of
the terms have been added in order of decreasing frequency of postings. Terms
added without increasing the amount of output are also shown. This same data
is illustrated in Figure 27. Note that in eight out of 14 searches, from one
to ten terms were used which had no incremental effect on the output retrieved,
because the output had already been retrieved by other terms (with higher
postings). Many of these terms which had no effect on the output retrieved
were infrequently-used Identifiers, or had spelling, punctuation, or spacing
errors. The results noted here are probably largely due to the redundancy
and overlap of related terms used by ERIC indexers.




Table 20 and Figure 28 show the percentage of the total output citations
(combined relevant and non-relevant) retrieved at each step. Using just the
four most heavily-posted terms per facet, all but three searches had achieved
more than 50% of the exhaustive output, and eight out of the 14 had achieved
70% or more of the full output. With ten terms per facet, over 96% of the
citations had been retrieved in all 14 searches.
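
The percentages reported here are of the kind produced by the following calculation, given as a minimal Python sketch. The posting sets are invented toy data, the function name is hypothetical, and each facet is assumed to be listed in decreasing order of postings. At each step the first k terms of every facet are ORed, the facets are ANDed, and the result is compared with the exhaustive intersection.

```python
def incremental_yield(facets):
    """Return (step, citations retrieved, percent of exhaustive output).

    facets -- list of facets; each facet is a list of sets of citation IDs,
              one set per term, ordered by decreasing number of postings
    """
    def intersect_at(step):
        result = None
        for facet in facets:
            ored = set().union(*facet[:step])   # OR within a facet
            result = ored if result is None else result & ored  # AND across
        return result

    longest = max(len(facet) for facet in facets)
    exhaustive = intersect_at(longest)
    rows = []
    for step in range(1, longest + 1):
        got = intersect_at(step)
        pct = 100.0 * len(got) / len(exhaustive) if exhaustive else 0.0
        rows.append((step, len(got), round(pct, 1)))
    return rows


# Toy posting sets (citation IDs are arbitrary integers).
facet_a = [{1, 2, 3, 4}, {3, 4, 5}, {5}]
facet_b = [{1, 2, 5, 6}, {4, 7}]
rows = incremental_yield([facet_a, facet_b])
```

In the toy data the third term of the first facet adds nothing, because its only posting was already retrieved by a more heavily posted term. This mirrors the terms with no incremental effect noted in the discussion of Table 20.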



Table 20 and Figure 29 show the percentage of the relevant citations
retrieved at each incremental step. Using just the four most heavily-posted
terms, ten searches had achieved 50% or more of the output judged relevant.
With ten terms per facet, all but one search had retrieved 92% or more of the
output judged relevant, but two searches did not achieve the last relevant
citation until the thirteenth term was reached. The fact that two searches
required the thirteenth term (in rank order by number of postings) for
completion of the set of relevant citations indicates that specific terms with low
postings may sometimes be important to a search, and gives warning that
searchers must not rely only on frequency of postings for information value.
For a discussion of the inverse relationship between information value and
frequency of term assignment, see Tell's The Use of ERIC Tapes in Scandinavia
and Williams' "Functions of a Man-Machine Interactive Retrieval System".



From the results of this study we conclude that for ERIC searching,
if exhaustivity is a requirement, there seems to be very little to be gained
by using more than 10 of the most heavily-posted terms per facet. (Of course
some facets will be completely satisfied by fewer than 10 terms.)
Fig. 27. Total Number of Citations Retrieved for the Incremental Searches



For fairly exhaustive searches with two facets, then, it is possible
that up to 20 terms would be needed, while for three facets up to 30 terms
would be needed to retrieve most of the relevant citations. These are, of
course, merely guidelines, and not absolute figures.

This does not mean that relevant items in excess of 10 should be
routinely ignored, nor that selection of terms should be made solely on the
basis of number of postings. Indeed, the choice of terms must be made by
meaning, not by number of postings.

Our study does indicate, however, that the extra effort of searching
for misspellings, or peripherally-relevant terms, is not productive. This
finding agrees with the data on variant forms that was reported in an earlier
section of this report. Centrally relevant Descriptors or Identifiers,
regardless of their number of postings, should certainly be included in the
search statement.

Terms with very few postings will seldom have an impact if the central
Thesaurus terms appropriate to the search are used. Terms with few postings
should probably only be chosen if they are specifically pertinent to the search
topic.

b. Methods of Limiting Quantity of Output and Their Effect on Relevance

Near the end of an on-line search on a given question, one is sometimes
confronted with a "too large" set of output citations. Aside from changes in
the search facets or terms from a subject point of view, the general procedure
followed to reduce the size of such final output sets is to use one of the
LIMIT options available with DIALOG. In an earlier section of this report, the
analysis of frequency of use of commands by terminal operators shows that the
LIMIT command is used 0.84 times per question, accounting for 3.40% of the
commands used per question.

In an effort to identify factors that might be helpful in limiting the
selected output, we investigated the effect that several different kinds of
limiting factors would have on the number of citations retrieved, and on the
percentage of relevant citations (from the exhaustive set) retrieved.

(1) Limit by LIMIT/MAJ Command

In the ERIC system, major applicability of a given Descriptor or
Identifier to a given document is indicated by the presence of an asterisk preceding
that term. In DIALOG, major value is represented by the MAJOR sub-command of
the LIMIT command. The Summary of Significant Rules in the ERIC Processing
Manual specifies that "Major Descriptors (identified by a preceding asterisk)
are limited to five (5) per document. The maximum number of Descriptors is
not limited but will depend on the nature of the document. Major Identifiers
are limited to one (1) per document. The maximum number of Identifiers is not
limited." In practice, 10 to 12 Descriptors are typically used in ERIC
indexing, while one to three Identifiers may be present.

The LIMIT/MAJ command allows the searcher to reduce an output set by
specifying that it must contain only Descriptors or Identifiers designated by
the ERIC indexers as having major value. As approximately half of the terms
for a given citation may be major terms, the yield for a given facet could be
expected to be cut in half by use of the LIMIT/MAJ command. An intersected
set could be expected to be greatly reduced by this method.

The LIMIT/MAJ command returns a set satisfied only by Descriptors or
Identifiers marked by an asterisk (ERIC's way of indicating major applicability
of this Descriptor to this document).

We were interested to see whether the use of the LIMIT/MAJ command
resulted in an output containing relatively more of the relevant citations (higher
precision than the exhaustive set), or whether a decrease in output would be
coupled with a proportionate decrease in the number of relevant citations (same
precision).

When used on an intersected set, the LIMIT/MAJ command, as implemented
at the time of the study, required at least one term in each intersected facet
to have MAJOR value (i.e., to carry an asterisk). In a three-way intersection,
this can be very restrictive because some of the facets may not include a major
term.

The Lockheed DIALOG Terminal Users Reference Manual suggests, "Use the
limit command on key conceptual terms so that only major Descriptors will be
selected. It is usually best to limit individual terms rather than sets
resulting from the combination of terms."

We investigated the LIMIT/MAJ command along these lines, using this
command on the ORed sets of terms making up the "key" concepts, rather than
on "key conceptual terms" alone. However, in some searches it was difficult
to decide which concept should be considered the key concept, so we used the
LIMIT/MAJ command separately on each facet except the grade level. This
parallels the treatment of the grade level facet in the incremental searches.

This task was carried out at the same time as the incremental searches
described in an earlier section of this report. Having obtained the exhaustive
sets for each of the 14 searches, we proceeded to use DIALOG's LIMIT command
and other methods to cut down on the quantity of output.

Eleven of the 14 searches described in the earlier section were used
for this study. Of the three searches not included, one triggered "DISK
STORAGE OVERFLOW", and because of a searcher error the output sets for two
were not printed.

The LIMIT/MAJ command was used on each facet except the grade level
facet. Each "MAJOR" facet was then intersected with its partner facet in
unlimited form, i.e., the partner set could contain asterisked or
non-asterisked (MAJOR or MINOR) terms. The results of these intersections were
ORed together, giving a set where a MAJOR term from either but not necessarily
both sets was present. This ORed set corresponds to the way the LIMIT/MAJ
command operated on an intersected set in an earlier version of DIALOG.
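
The difference between limiting an already intersected set and ORing the one-sided intersections can be expressed with ordinary set operations. The Python sketch below uses invented citation numbers and a hypothetical helper; each facet is modeled, as a simplification, by a mapping from citation to a flag saying whether any of the facet's terms is a major (asterisked) posting for that citation.

```python
def limit_major(facet):
    """Keep citations for which at least one facet term is a major posting."""
    return {doc for doc, is_major in facet.items() if is_major}


# Hypothetical facets: citation ID -> True if posted as MAJOR in this facet.
facet_a = {"ED1": True, "ED2": False, "ED3": True, "EJ4": False, "EJ6": False}
facet_b = {"ED1": False, "ED2": True, "ED3": True, "EJ5": True, "EJ6": False}

all_a, all_b = set(facet_a), set(facet_b)
maj_a, maj_b = limit_major(facet_a), limit_major(facet_b)

exhaustive = all_a & all_b                 # unlimited intersection
strict = maj_a & maj_b                     # MAJOR required in every facet
ored = (maj_a & all_b) | (all_a & maj_b)   # MAJOR in either facet suffices
```

With these toy facets the strict form keeps one citation of the four in the exhaustive set, while the ORed form keeps three, dropping only the citation that is a minor posting in both facets.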

These sets were then compared with each other and with the set
resulting from using the LIMIT/MAJ command on the previously intersected set, in
terms of relevant citations retrieved. The percentage of the relevant
citations from the exhaustive set was determined.

An example of the way in which sets were limited and combined is given
in Figure 30. The results of this experimental work are described in a later
section of this report.

(2) Limit by Accession Number 

Although DIALOG implements the accession number range as one parameter
of the LIMIT command, for expediency this operation was done manually from the
output of the exhaustive set. We feel that, whereas the ability to limit by
time is very important, using an accession number range as presently required
under the LIMIT command is an awkward means by which to achieve this effect. One
must look up the desired accession numbers in a table printed for the ERIC
Chronolog; one must also LIMIT ED numbers and EJ numbers separately, because
they are separate series of numbers. It would be far easier to use a
year-month parameter for accession range limitation; such a parameter would have
the additional great advantage of being useful across data bases.

In fact, RIE issue numbers are already stored as an inverted file
(Figure 31). If this file were made numerical and stored as YYMM it would
be much more useful.

It would be very helpful if Lockheed would implement a YYMM
chronological feature as one of the options available with the LIMIT command,
independent of the data base being searched. The accession number prefix (ED,
EJ) could still be accepted as a modifier when appropriate.
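
The proposed feature amounts to a simple range test on a year-month value. The Python sketch below illustrates the idea only; it is not a description of any DIALOG facility. The function name, the record layout and the accession numbers paired with dates are all invented for the example, and the YYMM values are assumed to be four-digit strings within a single century so that they compare correctly as text.

```python
def limit_by_yymm(citations, start, end, prefix=None):
    """Keep accession numbers whose YYMM value lies in [start, end].

    citations -- list of (accession_number, yymm) pairs, yymm as 'YYMM' text
    prefix    -- optional accession prefix ('ED' or 'EJ') used as a modifier
    """
    kept = []
    for accession, yymm in citations:
        if prefix is not None and not accession.startswith(prefix):
            continue
        if start <= yymm <= end:
            kept.append(accession)
    return kept


# Invented records: the dates attached to these numbers are not real.
sample = [
    ("ED070001", "7301"),
    ("EJ065432", "7303"),
    ("ED082775", "7312"),
    ("ED060000", "7205"),
]
first_half_1973 = limit_by_yymm(sample, "7301", "7306")
ed_only_1973 = limit_by_yymm(sample, "7301", "7312", prefix="ED")
```

A single range covers both the ED and EJ series, which is the advantage argued for above; the prefix modifier restores the ability to restrict output to one series.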

(3) Limit by Printing Only the First N Citations

Several installations have used this type of output limiting; we tested
how this method affected the number of relevant citations. Seven of our 14
searches produced more than 100 citations; using a limit of 100 citations the
percentage of relevant citations was calculated for six of these searches.
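
The calculation described here can be sketched as follows in Python. The function name and the sample data are hypothetical; the output list is assumed to be in the order in which DIALOG would print it.

```python
def relevant_share_in_first_n(output, relevant, n=100):
    """Percent of all relevant citations that fall within the first n printed.

    output   -- citation IDs in printing order
    relevant -- set of IDs judged relevant in the exhaustive output
    """
    kept = sum(1 for citation in output[:n] if citation in relevant)
    total = sum(1 for citation in output if citation in relevant)
    return 100.0 * kept / total if total else 0.0


# Invented example: 250 citations, four of them judged relevant.
output = [f"C{i:03d}" for i in range(1, 251)]
relevant = {"C005", "C050", "C120", "C240"}
share = relevant_share_in_first_n(output, relevant, n=100)
```

In this made-up case half of the relevant citations lie beyond the cutoff and would be lost.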

Three researchers for whom searches were done during this study
mentioned finding more "good" citations towards the front of the output.
This may reflect greater timeliness (the newest citations are always printed
first, unless the searcher requests another sorting sequence), or better
acquisition efforts by the ERIC clearinghouses in more recent times, or
greater applicability of CIJE citations (which appear at the front of the
output) to [illegible].

(4) Results of All Attempts to Limit Output

Using the LIMIT/MAJ command on a previously intersected set, an average
of only 33% of the relevant citations was retrieved from an average of 27% of
the total citations. This seems to be a too-radical solution to the problem
of too much output.
SEARCH HISTORY

SET  ITEMS  DESCRIPTION

  1    287  IT=DEAF
  2    105  IT=DEAF EDUCATION
  3     98  IT=DEAF CHILDREN
  4     73  IT=LIPREADING
  5     69  IT=MANUAL COMMUNICATION
  6     69  IT=SIGN LANGUAGE
  7     93  IT=DEAF RESEARCH
  8     33  IT=FINGER SPELLING
  9     21  IT=VISIBLE SPEECH
 10     18  IT=DEAF INTERPRETING
 11   1365  IT=FILMS
 12   1016  IT=VIDEO TAPE RECORDINGS
 13    871  IT=TELEVISION
 14    825  IT=INSTRUCTIONAL TELEVISION
 15    640  IT=TELEVISED INSTRUCTION
 16    579  IT=INSTRUCTIONAL FILMS
 17    391  IT=CLOSED CIRCUIT TELEVISION
 18    365  IT=FILMSTRIPS
 19    343  IT=FILM STUDY
 20    337  IT=MEDIA TECHNOLOGY
 21    283  IT=PROGRAMING (BROADCAST)
 22    265  IT=TELEVISION VIEWING
 23     64  IT=VIDEO CASSETTE SYSTEMS
 24     62  IT=SINGLE CONCEPT FILMS
 25     42  IT=OPEN CIRCUIT TELEVISION
 26     39  IT=ANIMATION
 27      5  IT=TELEVISION INSTRUCTION
 28    567  1+2+3+4+5+6+7+8+9+10
 29   5264  11+12+13+14+15+16+17+18+19+20+21
 30     28  28*29

 46    311  28/MAJ
 47   3285  29/MAJ
 48      8  30/MAJ
 49     12  46*29

 53     13  47*28
 54     17  53+49



Figure 30. Example of Use of LIMIT/MAJ on 
Different Set Combinations 
Fig. 31. Example of RIE Issue Numbers
Stored as an Inverted File
The results of LIMITing one facet to MAJOR and intersecting the
resultant set with its partner set in full form produced better results in most
cases. An average of 60% of the relevant citations was retrieved from an
average of 53% of the total citations. However, the best use of LIMIT/MAJ
for these searches appears to be the ORed set of intersections of one MAJOR
facet with a partner non-major facet, and vice versa. In the study, this
method retrieved an average of 85% of the relevant citations, from an average
of 75% of the total citations. In nine of the 11 searches studied, this ORed
set retrieved a higher percentage of the relevant citations than of total
citations. None of the other possible ways to use LIMIT/MAJ appears to be
as successful. It must be noted, however, that the ORed set we are speaking
of does not reduce the total number of citations retrieved as effectively as
some of the other versions.
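
The averages reported for the three uses of LIMIT/MAJ can be compared on a common footing. If a limiting method keeps a fraction r of the relevant citations and a fraction t of all citations, the precision of the limited output is r/t times the precision of the exhaustive output. The Python sketch below applies this to the averages quoted above. The labels and the function name are ours, and because the inputs are averages across searches the ratios are only indicative, not per-search results.

```python
def precision_gain(pct_relevant_kept, pct_total_kept):
    """Precision of the limited set relative to the exhaustive set (r / t)."""
    return pct_relevant_kept / pct_total_kept


# (average % of relevant citations kept, average % of total citations kept)
methods = {
    "LIMIT/MAJ on the intersected set": (33, 27),
    "one MAJOR facet * full partner facet": (60, 53),
    "ORed MAJOR intersections": (85, 75),
}
gains = {name: round(precision_gain(r, t), 2) for name, (r, t) in methods.items()}
```

On these figures all three methods raise precision only modestly (ratios of roughly 1.1 to 1.2), so the main difference between them is how much of the relevant output each one sacrifices.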

We feel that this data indicates that it would be helpful if another
type of LIMIT/MAJ command were implemented, which would retrieve items having
MAJOR posting in either (any) facet of an intersected set. This command
should not replace the present LIMIT command, but should simply provide
another option. A further option which would be very useful in cases where
requestors want output only from a given time period would be the provision
of a LIMIT/YYMM-YYMM feature. This could augment or replace the present
LIMIT by accession number feature.

As to recommendations to searchers, no hard and fast rules can be
given. We have attempted to provide a "menu" of possible methods of reducing
output quantity, and must leave it to the discretion of the individual searchers
which method they choose.
APPENDIX A

Table 1. Chronological Record of Searches Performed by All Searchers,
October 18 through November 1.



Searcher 

JT 
AH 
~AH 
JT 
JT 
AH 
AH 
JT . 
JT 
AH 
JT 
JR 
JT 
AH 
JR 
JT 
JR 
JT 
AH 
AH 
JT 
JT 



JT 
JT 
DT 
DT 
JT 
JT 
DT 
JT 
JT 
AH 
AH 
JT 
JT 
AH 
AH 
AH 
JR 



Day t 
Thurs. 
,Oct. 18 



Thurs. 
Oct, 25 



II 
II 

fi 

II 
•I 



Time at Start
of Search

5i36 

6il4 
6td$ 
6i48 
7i07 
7i29 • 

dii3 
8130 
6152 

9«17 
9132 
.9150 
10.25 

II1I5 
lli45 

11«55 
12tl8 

I2i35 
12i55 
lt08 



5«25 
5tko 

5«55 
61I7 
6150 

7132 
8i20 

10 1 00 

10I15 

10|40 
lljOl 

11I36 

12il8 
12136 
12i5l 
ItlO 



Time at End
of Search

5135 
5«55 
61II 
6t30 
6ik5 
7i06 
7i26 
7tk5 
8i29 . 

8151 
9il0 

9i35 

9i50- 

10.08 

lltl5 
lli3'f 

12 1 00 
I2il6 

12135 

I2i50 
I1O6 ' 
li20 



5«38 
5«55 
61I9 
6ifl 
7i02 

71^5 
8iif3 
9i55 
IO1I5 
l0O9 
10 1 56 
111 16 
1^15 

l?i5l 

rio6 

1»30 



Elapsed Time Reported
by DIALOG (Minutes)**

45.98^ 
15.56 
16,23 
IkM 
1^.39 
17,16 
. I8.03 

15.^)8 

19.66 

16.83 

16.00 

15.80 

17.19 

31.^3* 

16,56 

17.35 

19.58 m 

^ 12.18 
12.12 



13.^1 
1^.05 
25.17 
17.1-^ 
13.-^8 
12.06 
16.90 
13. S5 

13.<^3 
20, as 

15.1<J 

1^.87 

13.92* 

16.06 

14.46 

14.29 
18.61 



*Indicates that the system was down during part of this search.
**The elapsed time reported by DIALOG on-line at the end of each search.
Table 1 (continued)

                          Time at Start   Time at End   Elapsed Time Reported
Searcher  Day             of Search       of Search     by DIALOG (Minutes)

JT        Fri., Oct. 26   5:10            5:22          11.02
JT        "               5:25            6:36          11.88
AH        "               5:36            5:50          13.28
AH        "               5:50            6:08          16.05
JT        "               6:10            6:25          13.94
JT        "               6:26            6:40          12.37
AH        "               6:38            6:52          12.97
JT        "               6:56            7:10          13.09
JT        "               7:11            7:25          13.18
AH        "               7:25            7:39          13.54
JT        "               8:05            8:45          25.61*
AH        "               8:52            9:08          15.36
JR        "               9:16            9:34          16.74
JR        "                               9:50          13.21
JR        "               9:51            10:07         16.31
JT        "               10:10           10:36         14.37
JT        "               10:30           10:49         15.62
JT        "               10:50           11:08         17.14
AH        "               11:08           11:31         21.68
AH        "
JR        "                               12:20         18.24
JT        "               12:25           12:40         14.12
JT        "               12:45           12:59         13.??
JT        "               1:00            1:15          16.27
JR        Mon., Oct. 29   5:15            5:27          10.51
JR        "               5:27            5:40          11.01
JR        "               5:40            5:52          11.67
JR        "               5:53            6:10          14.52
JR        "               6:12            6:24          13.34
JR        "               6:30                          12.59
JR        "                               6:58          12.89
JR        "               6:58            7:13          12.23
JR        "               7:14            7:26          13.76
JR        "               8:07            8:21          11.65
JR        "               8:22            8:35          12.16
JR        "               8:35            8:49          13.24
JR        "               8:50            9:03          13.10
JR        "               9:05            9:17          11.59
JR        "               9:24            9:39          14.87
JT        "                               10:16         14.80*
JT        "               10:16           10:31         13.71
JT        "               10:32           10:58         23.29*
JT        "               11:00           11:12         12.47
JR        "               11:28           11:45         14.72
JT        "               11:58           12:10         12.70
JT        "               12:20           12:??         13.55
JT        "               12:35           12:50         15.04
JT        "               12:51           1:10          18.37
JT        "               1:10            1:25          12.33
Table 1 (continued)

Searcher



y DT 

DT 

DT 
DT 
DT 
Jr AH 
.AH 
DT 
AH 
AH 
• DT 

DT 

; V DT 
JR 
JR 
JR 
JR 
AH 
AH 



DT 
DT 
DT 
!» 
AH 
AH 
AH 
DT 
DT 
JR 
JR 
JR 
JR 
JR 
DT 
DT 

JR 
JR 
JT 
JR 
JR 
JT 
JT 
JR 
JR 
JT 
JT 
JT 
DT 
DT 
DT 

or 



D^ 
Tues, 
Oct. 30 

• ee - ' 
tf 

*cy »♦', -le - , 



N 

n 

N 




Wed« 
Oct. 31 



k DT 



Thurs, 
Nov. 1 



M 



Time at Start
of Search

-5iOO 



? 

7i25- 

8i32 * 

8156 
9il2 
9i'>3 
. IO1O3 
IO137 
II152 
1'2:13 
12i33 
12i59 

5i39 
61OI 

61I3 

6i55 

7i20 

lOiOl 

IO125 

10i45 

lliOO 

till? 

11 1 33 
llf50 
I2i06 

^•32 

5i30 

6109 
6i25 
6 1*^5 
7i05 
7 120 
81OO 
81I5 
8130 

8i<f5 
9i20 

9i35 
91*^7 
lOiOl 



'1, 



Time at End
of Search

5.31 

5i55 
61Q8 

6i22 

6i37 

5i52 

7i09 

7i2<^ 

7i38 
8il4 

8130 

8|i|4 

8156 

9i02 

, 9i26 

: IO1O2 

lilies 

12il2 
12i32 

lil7 

5i55 
6il7 
6i40 

7 115 
7i37 - 
I0i2tf 

, UiOO 
lli25 

111 32 
11 150 

12I05 

I2i28 
I1I2 

lOU 
li43 

5i30 

5i^3 
61OO 
6125 
6i40 
7iOO 
7il8 

705 

81I3 

8i3P, 

8i<>5 

9tOO 

9i'>7 
/ lOiOO 



Elapsed Time Reported
by DIALOG (Minutes)

,96 
.90 
.03 

•17 
.32 

*>,27 
3.09 
12,08 

1.93 , 
3.83 
1.73 

0.48 ; 
.4.07 
2.30 

1.82 
.0.87 
5.78 
,8,46 
8.29 
9.49 
.6.60 



5.09,. 
5.37 
4.81 
6.18 
6.38 
20.57 
8.79 
4.25 

6.97 
3.^*9 

5,28 

3.97 
8.90 
21.89 
4.34 
1.69 

4.28 

2.21 • 

2.50 

5.04 

3.84 

4.99 

3.35 

5.23 

2.4i^ 

3.^0 
2.9^ 
• .80 
3.12 
1.14 
3.17 
0.92 



Table 1 (continued)



Time at Startr' 



Tine at Knd 





Dav 


of Ss&rc h 






Thurs 






JT 


Nov. 1 




^ II1O8 


JT 


ft 


II1O8 


II12O 


JT 


•• 


lli20 


^ 11|40 


JR 


•t 


11 1^0 


i 11,57 


DT 


u , 


II158 


; I2ii4 


JT 




12t42 


: 12i56 


JR 


f» 


12i58 


' ltl2 


JT 


•• 


1»15 


lt26 



Elapsed Time Reported
by DIALOG (Minutes)

11.87 

16.93* 

14.27 

16.77 

13.70 

13.12 

12,80 



\ 



/ 



Table 2. Elapsed Time for Each Search Arranged According to Searcher (Excluding Searches
Done When System Was Down), with Cumulative Mean for Each Searcher.



ieaxcheri 



AH 



JT 



JR 



Thuxs, 
Oet, 18 



Mean 

Elapsed 

Tiaet 



/ 



Thurs, 
Oct, 25 



Hean 

Glapjed 
Tioet 



Slapsed Cumula- Elapsed Gumula- Slapsed Cumula* :Jlapsed Cusula- 
Tines tive Mean Times tive Mean Tiaes tlve Mean Times tlve Mian 



15.56 
16.23 
17.16 
I8.03 
19.66 
17.19 
16.54 
12.44 



16.60 



20.86 
15.12 
16.06 
14.46 
14,29 



16.16 



55.56 
15.90 
16.32 

16,75 
17.33 
17:31 
17.20 
16,60 



17.07 
16,88 
16.80 
16.61 
16.4*1 



15.98 
14.44 

14.39 
14.C7 
; 16.88 
I3.B0 
16.56 
19.58 
12.18 
12.12 



11^ 



13.'«'1 
14.05 
13.28 
12.06 

13.85 
13.83 
14.87 



15.98 
15.21 

.1?.9^ 
14.92 

15.31 
15.39 
15.56 
16.06 

■15.63 
15.28 



15.11 
15.02 

14,89 
14.69 
14.63 
14.58 
14.6U 



16.00 
17.35 



16.68 



18.81 



18.81 



16.00 
16.68 



17.39 



25.17 
17.12 
16.90 



25.17 
21.15 
19.73 



Friday 
Oct, 26 



Mean 
Slapsed 
^^iasi 



13.28 


16.21 


11.02 


14.40 


16.74 


1^,23^ 


16.05 


16.20 


11.88 


14.26 


13.21 


/^6.42 


12.97 


15.99 


13.94 


14.25 


16.31/ 


/ 16.40 


13.54 


15.85 


12.37 


14.16 


13.24^ 


16.67 


15.36 


15.82 


13.09 


14.11 


/ 




21.68 


I6.I3 


13.18 


14.07 




18.07 


16.23 


14.37 


14.08 


I 








15.62 


14.14 








\ 


17.14 


14.26 


\ 

\ 






14.12 

13.^ 
16.27 


14.25 
14.22 

14.30 . 


1 





11,87 
Table 2 (continued)



Searehert ^ 
Oct. 29 



. AH . 

Slapsed Cumula- 
Times tlve Kean 



?!ean 3:iapsed 
Tiinei 



JT 



Times tive Kean 



13.71 
12.47 
12.70 

13.55 
15.U4 
18.37 
12.33 



14.28 
11.01 
14.17 
14,15 
14.18 
14.-30 
14.24 



ja 



14.02 



iilapsed 
Times 

10.51 
15.35 
11.67 
1^^,52 
13.3^ 
12.59 
12.3ft 
12.23 
13.76 
11.65 
•12.16 
13.24 
13.10 
11.59 
14.87 
14.72 

12.74 



Oumula- 
tive Kiean 

15.90 

14.59 
14.94 
14.81 
14.64 
.14.51 
14.36 
14.32 

14,17 
14.06 
14.01 
13.97 
. 13.85 
13.90 
13.94 



3T 



Elapsed 
Times 



Cumula- 
tive Hean 



Tues. 
Oct. 30 



elapsed 
Timet 



14.27 
13.09 
11.93 
13.33 
18.46 

18.29 



14^ 



16. 1> 
16. .0 

15.--^ 
15.74 

15.^-5 
15. S4 



Wed. 
Oct. 31 



16.38 
20.57 
18.79 



15.96 
16.12 
16.^1 



Mean 
Timet 

ERIC 



IM8 



12.3b , 13.67 

XI 1.82 13.79 

1^0^87 13.67 

15.78, 13.75.. 



12^ 



14.96 
14.90 

1^3.03 
13.49 
12.17 
12.32 
12.43 
12.08 

11.73 
10.48 

14.07 
19.49 
16.60 



18.54 

17.31- 
17.01 

16.51 
15.97 
15.56 
15.25 
.14.96 
14.69 
14.37 
14.35 
14.69 
14.81 



13.49 
15.28 

13.97 
18.90 



13,74 
13.79 
13.8) 
13.97 



.15.09 
15.37 

\14.81 
16.18 

14^.2:, 

16.97 
14.34 

11.69\ 



14.83 
14.86 
14.85 
14.52 
14.89 
14.93 

14.95 
14.82 



14.84 



Table 2 (continued)

Searcher* AH JT ^ 

Thurs aiapsed Gunula- Jlapsed Cumula- Elapsed Gumula- ^lapsed uumu^--. 

Nov. 1 Times tive Mea n Times tive Itean Times tive Ivean Times live Kean 

12.50 1'*.20 14.28 13.98 11.14 14.67 



14.99 
13.35 



Kean 



Slasped ^ " 
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Cumula- 
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tive Mean 


Times 


tive I'fean 


Times 


14.20
11.14
14.17
13.95
13.99
14.07


13.12 


13.93 




14.06 








14.03 










13.80




13.00 



14.61 
14.48 



13.40   14.17   13.84   13.95   16.77   14.56
12.94 
14.80 

12.33 
11.87
13.70 
12.80 



Summary Statistics:
Searcher:   AH   JT   Ja   3T   All



Total Number of          2ft   142

Searches Completed:   29   ^   39   28

Mean Elapsed Time:   \l6)l   14.03   13.93   14.5^   14.55

Std. Dev.:   2.5336   1.S774   2.1652   3.-^592
RECENT ILR PUBLICATIONS

Publication of papers and reports of interest to scholars and practitioners in
the field of library and information science is an important function of the Institute
of Library Research. In addition to this study, the following have been published
recently by ILR.

ILR-73-001  Todd, Judy, Summary Report of Student Studies of Subject Headings
            Used in the University of California, Berkeley, Subject Catalog
            (July 1973) a pp. (ERIC No. ED-082 775)

ILR-73-002  Bourne, Charles P., and Jo Robinson, SDI Citation Checking as a Measure
            of the Performance of Library Document Delivery Systems (July 1973)
            10 pp. (ERIC No. ED-082 774)

ILR-73-003  Weeks, Kenneth, Determination of Pre-Acquisition Predictors of Book Use:
            Final Report (July 1973) 20 pp. (ERIC No. ED-082 il^)

ILR-73-004  Weeks, Kenneth, Proposal for a University of California/California State
            University and Colleges Inter-Segmental Machine Readable Library
            Patron Card (August 1973) 21 pp. (ERIC No. ED-082 777)

ILR-73-005  LeDonne, Marjorie, "Summary of Court Decisions Relating to the Provision
            of Library Services in Correctional Institutions," Association
            of Hospital and Institution Libraries Quarterly (Winter/Spring
            1973) 9 pp.

ILR-73-006  Thelin, John, and Bonnie F. Shaw (editors), Institute of Library Research
            Annual Report: July 1972 to June 1973 (September 1973) 30 pp.

ILR-73-007  Dekleva, Borut, Uniform Slavic Transliteration Alphabet (USTA)
            (October 1973) ^2 pp. (ERIC No. ED-086 \6«»)

ILR-73-008  LeDonne, Marjorie, Findings and Recommendations, Volume I, Survey of
            Library and Information Problems in Correctional Institutions
            (January 1974) 88 pp.

ILR-73-009  LeDonne, Marjorie, Access to Legal Reference Materials in Correctional
            Institutions, Volume II, Survey of Library and Information
            Problems in Correctional Institutions (January 1974) ?P.

ILR-73-010  LeDonne, Marjorie, David Christiano, and Jane Scantlebury, Current
            Practices in Correctional Library Services: State Profiles,
            Volume III, Survey of Library and Information Problems in
            Correctional Institutions (January 1974) o8 pp.

ILR-73-011  LeDonne, Marjorie, David Christiano, and Joan Stout, Bibliography,
            Volume IV, Survey of Library and Information Problems in
            Correctional Institutions (January 1974) 28 pp.

ILR-73-012  Gregor, Dorothy, Feasibility of Cooperative Collecting of Exotic
            Foreign Language Serial Titles among Health Sciences Libraries
            in California (February 1974) t** pp.

ILR-74-001  Noaik, Barbara, The Use Status of P'-cha Requested from the University
            of California, Berkeley, Inter-Library Loan (March 1974) 4 pp.

ILR-74-002  Bourne, Charles P., Institute of Library Research Annual Report:
            July 1973 to June 1974 (1974) 25 pp. + appendices

ILR-74-003  Humphrey, Allan J., Survey of Selected Installations Actively Searching
            the ERIC Magnetic Tape Data Base in Batch Mode, Volume I (June
            1973) 86 pp.

ILR-74-004  Cooper, William S., Donald T. Thompson, and Kenneth R. Weeks,
            The Duplication of Monograph Holdings in the University of California
            Library System (October 1974) 32 pp.